Gabriel Lugo
Differential Geometry in Physics




Copyright ©2021 Gabriel Lugo 


This work is licensed under a Creative Commons CC BY-NC-ND license. To 
view a copy of the license, visit http://creativecommons.org/licenses. 


Suggested citation: Lugo, Gabriel. Differential Geometry in Physics.
Wilmington. University of North Carolina Wilmington William Madison Randall
Library, 2021.

doi: https://doi.org/10.5149/9781469669267_Lugo


ISBN 978-1-4696-6924-3 (cloth: alk. paper) 
ISBN 978-1-4696-6925-0 (paperback: alk. paper) 
ISBN 978-1-4696-6926-7 (open access ebook) 


Cover illustration: The Hopf fibration. 
Published by the UNC Wilmington William Madison Randall Library 


Distributed by UNC Press 
www.uncpress.org 


Publication of this book was supported by a grant from the Thomas W. Ross 
Fund from the University of North Carolina Press. 


Escher detail on following page from “Belvedere” plate 230, [1]. All other figures 
and illustrations were produced by the author or taken from photographs taken 
by the author and family members. Exceptions are included with appropriate 
attribution. 




This book is dedicated to my family, for without their love and support, this
work would not have been possible. Most importantly, let us not forget that
our greatest gifts are our children and our children’s children, who stand as a
reminder that curiosity and imagination are what drive our intellectual
pursuits, detaching us from the mundane and from confinement to Earthly values,
and bringing us closer to the divine and the infinite.


G. Lugo (2021) 




Contents


Preface

1 Vectors and Curves
    1.1 Tangent Vectors
    1.2 Differentiable Maps
    1.3 Curves in $\mathbf{R}^3$
        1.3.1 Parametric Curves
        1.3.2 Velocity
        1.3.3 Frenet Frames
    1.4 Fundamental Theorem of Curves
        1.4.1 Isometries
        1.4.2 Natural Equations

2 Differential Forms
    2.1 One Forms
    2.2 Tensors
        2.2.1 Tensor Products
        2.2.2 Inner Product
        2.2.3 Minkowski Space
        2.2.4 Wedge Products and 2-Forms
        2.2.5 Determinants
        2.2.6 Vector Identities
        2.2.7 n-Forms
    2.3 Exterior Derivatives
        2.3.1 Pullback
        2.3.2 Stokes’ Theorem in $\mathbf{R}^n$
    2.4 The Hodge $\star$ Operator
        2.4.1 Dual Forms
        2.4.2 Laplacian
        2.4.3 Maxwell Equations

3 Connections
    3.1 Frames
    3.2 Curvilinear Coordinates
    3.3 Covariant Derivative
    3.4 Cartan Equations

4 Theory of Surfaces
    4.1 Manifolds
    4.2 The First Fundamental Form
    4.3 The Second Fundamental Form
    4.4 Curvature
        4.4.1 Classical Formulation of Curvature
        4.4.2 Covariant Derivative Formulation of Curvature
    4.5 Fundamental Equations
        4.5.1 Gauss-Weingarten Equations
        4.5.2 Curvature Tensor, Gauss’s Theorema Egregium

5 Geometry of Surfaces
    5.1 Surfaces of Constant Curvature
        5.1.1 Ruled and Developable Surfaces
        5.1.2 Surfaces of Constant Positive Curvature
        5.1.3 Surfaces of Constant Negative Curvature
        5.1.4 Bäcklund Transforms
    5.2 Minimal Surfaces
        5.2.1 Minimal Area Property
        5.2.2 Conformal Mappings
        5.2.3 Isothermal Coordinates
        5.2.4 Stereographic Projection
        5.2.5 Minimal Surfaces by Conformal Maps

6 Riemannian Geometry
    6.1 Riemannian Manifolds
    6.2 Submanifolds
    6.3 Sectional Curvature
    6.4 Big D
        6.4.1 Linear Connections
        6.4.2 Affine Connections
        6.4.3 Exterior Covariant Derivative
        6.4.4 Parallelism
    6.5 Lorentzian Manifolds
    6.6 Geodesics
    6.7 Geodesics in GR
    6.8 Gauss-Bonnet Theorem

7 Groups of Transformations
    7.1 Lie Groups
        7.1.1 One-Parameter Groups of Transformations
        7.1.2 Lie Derivatives
    7.2 Lie Algebras
        7.2.1 The Exponential Map
        7.2.2 The Adjoint Map
        7.2.3 The Maurer-Cartan Form
        7.2.4 Cartan Subalgebra
    7.3 Transformation Groups

8 Classical Groups in Physics
    8.1 Orthogonal Groups
        8.1.1 Rotations in $\mathbf{R}^2$
        8.1.2 Rotations in $\mathbf{R}^3$
        8.1.3 SU(2)
        8.1.4 Hopf Fibration
        8.1.5 Angular Momentum
    8.2 Lorentz Group
        8.2.1 Infinitesimal Transformations
        8.2.2 Spinors
    8.3 N-P Formalism
        8.3.1 The Kerr Metric
        8.3.2 Eth Operator
    8.4 SU(3)

9 Bundles and Applications
    9.1 Fiber Bundles
    9.2 Principal Fiber Bundles
    9.3 Connections on PFB’s
        9.3.1 Ehresmann Connection
        9.3.2 Horizontal Lift
        9.3.3 Curvature Form
    9.4 Gauge Fields
        9.4.1 Electrodynamics
        9.4.2 Dirac Monopole
        9.4.3 BPST Instanton

References

Index


Preface 


These notes were developed as part of a course on differential geometry which
the author has taught for many years at UNCW. The first five chapters, plus
chapter six, constitute the foundation of the three-hour course. The course is
cross-listed at the level of seniors and first-year graduate students. In
addition to applied mathematics majors, the class usually attracts a good
cohort of double majors in mathematics and physics. Material from other
chapters has inspired a number of honors and master's-level theses. This book
should be accessible to students who have completed traditional training in
advanced calculus, linear algebra, and differential equations. Students who
master the entirety of this material will have gained insight into very
powerful tools in mathematical physics at the graduate level.

There are many excellent texts in differential geometry but very few have 
an early introduction to differential forms and their applications to physics. It 
is the purpose of these notes to: 


1. Provide a bridge between the very practical formulation of classical
differential geometry created by early masters of the late 1800s, and the
more elegant but less intuitive modern formulation in terms of manifolds,
bundles and differential forms. In particular, the central topic of
curvature is presented in three different but equivalent formalisms.


2. Present the subject of differential geometry with an emphasis on making
the material readable to physicists who may have encountered some of
the concepts in the context of classical or quantum mechanics, but wish
to strengthen the rigor of the mathematics. A source of inspiration for
this goal is rooted in the shock to this author, as a graduate student in
the 70s at Berkeley, at observing the gaping failure of communication
between the particle physicists working on gauge theories and differential
geometers working on connections on fiber bundles. They seemed to be
completely unaware at the time that they were working on the same
subject.


3. Make the material as readable as possible for those who stand at the 
boundary between theoretical physics and applied mathematics. For this 
reason, it will be occasionally necessary to sacrifice some mathematical 
rigor or depth of physics, in favor of ease of comprehension. 




4. Provide the formal geometrical background for the mathematical theory 
of general relativity. 


5. Introduce examples of other applications of differential geometry to physics
that might not appear in traditional texts used in courses for mathematics
students. For example, several students at UNCW have written master's
theses in the theory of solitons, but usually they have followed the path
of Lie symmetries in the style of Olver. We hope that the elegance of
Bäcklund transforms will attract students to a geometric approach to the
subject. The book is also a stepping stone to other interconnected
areas of mathematics such as representation theory, complex variables and
algebraic topology.


G. Lugo (2021) 


Chapter 1 


Vectors and Curves 


1.1 Tangent Vectors 


1.1.1 Definition Euclidean n-space $\mathbf{R}^n$ is defined as the set of
ordered n-tuples $p = (p^1, \dots, p^n)$, where $p^i \in \mathbf{R}$ for each
$i = 1, \dots, n$. We may associate a position vector
$\mathbf{p} = (p^1, \dots, p^n)$ with any given point $p$ in n-space.
Given any two n-tuples $\mathbf{p} = (p^1, \dots, p^n)$,
$\mathbf{q} = (q^1, \dots, q^n)$ and any real number $c$, we define two
operations:

$$\mathbf{p} + \mathbf{q} = (p^1 + q^1, \dots, p^n + q^n), \qquad
c\,\mathbf{p} = (c\,p^1, \dots, c\,p^n). \tag{1.1}$$


These two operations of vector sum and multiplication by a scalar satisfy all
the 8 properties needed to give the set $V = \mathbf{R}^n$ a natural structure
of a vector space. It is common to use the same notation $\mathbf{R}^n$ for the
space of n-tuples and for the vector space of position vectors. Technically, we
should write $p \in \mathbf{R}^n$ when we think of $\mathbf{R}^n$ as a metric
space and $\mathbf{p} \in \mathbf{R}^n$ when we think of it as a vector space,
but like most authors, we will freely abuse the notation.¹


1.1.2 Definition Let $x^i$ be the real valued functions in $\mathbf{R}^n$
such that

$$x^i(p) = p^i$$

for any point $p = (p^1, \dots, p^n)$. The functions $x^i$ are then called the
natural coordinate functions. When convenient, we revert to the usual names for
the coordinates, $x^1 = x$, $x^2 = y$ and $x^3 = z$ in $\mathbf{R}^3$. A small
awkwardness might occur in the transition to modern notation. In classical
vector calculus, a point in $\mathbf{R}^n$ is often denoted by $\mathbf{x}$, in
which case we pick up the coordinates with the slot projection functions
$u^i : \mathbf{R}^n \to \mathbf{R}$ defined by

$$u^i(\mathbf{x}) = x^i.$$

¹ In these notes we will use the following index conventions:
• In $\mathbf{R}^n$, indices such as $i, j, k, l, m, n$ run from 1 to $n$.
• In space-time, indices such as $\mu, \nu, \rho, \sigma$ run from 0 to 3.
• On surfaces in $\mathbf{R}^3$, indices such as $\alpha, \beta, \gamma, \delta$ run from 1 to 2.
• Spinor indices such as $A, B, \dot{A}, \dot{B}$ run from 1 to 2.


1.1.3 Definition A real valued function in $\mathbf{R}^n$ is of class $C^r$
if all the partial derivatives of the function up to order $r$ exist and are
continuous. The space of infinitely differentiable (smooth) functions will be
denoted by $C^\infty(\mathbf{R}^n)$ or $\mathscr{F}(\mathbf{R}^n)$.


1.1.4 Definition Let $V$ and $V'$ be finite dimensional vector spaces such as
$V = \mathbf{R}^k$ and $V' = \mathbf{R}^n$, and let $L(V, V')$ be the space of
linear transformations from $V$ to $V'$. The set of linear functionals
$L(V, \mathbf{R})$ is called the dual vector space $V^*$. This space has the
same dimension as $V$.

In calculus, vectors are usually regarded as arrows characterized by a direc- 
tion and a length. Thus, vectors are considered as independent of their location 
in space. Because of physical and mathematical reasons, it is advantageous to 
introduce a notion of vectors that does depend on location. For example, if the 
vector is to represent a force acting on a rigid body, then the resulting equations 
of motion will obviously depend on the point at which the force is applied. In 
later chapters, we will consider vectors on curved spaces; in these cases, the 
positions of the vectors are crucial. For instance, a unit vector pointing north 
at the earth’s equator is not at all the same as a unit vector pointing north 
at the tropic of Capricorn. This example should help motivate the following 
definition. 


1.1.5 Definition A tangent vector $X_p$ in $\mathbf{R}^n$ is an ordered pair
$\{\mathbf{x}, p\}$. We may regard $\mathbf{x}$ as an ordinary advanced
calculus “arrow-vector” and $p$ as the position vector of the foot of the
arrow.

The collection of all tangent vectors at a point $p \in \mathbf{R}^n$ is called
the tangent space at $p$ and will be denoted by $T_p(\mathbf{R}^n)$. Given two
tangent vectors $X_p$, $Y_p$ and a constant $c$, we can define new tangent
vectors at $p$ by $(X + Y)_p = X_p + Y_p$ and $(cX)_p = cX_p$. With this
definition, it is clear that for each point $p$, the corresponding tangent
space $T_p(\mathbf{R}^n)$ at that point has the structure of a vector space. On
the other hand, there is no natural way to add two tangent vectors at different
points.
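The bookkeeping in Definition 1.1.5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `TangentVector` and the method `scale` are hypothetical and do not appear in the text. The sketch shows that the vector-space operations are defined fiber by fiber, so that adding vectors with different feet is rejected.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TangentVector:
    """Hypothetical model of X_p = {x, p}: a vector part x attached at the point p."""
    x: tuple  # components of the "arrow" vector
    p: tuple  # coordinates of the foot of the arrow

    def __add__(self, other):
        # (X + Y)_p = X_p + Y_p only makes sense inside a single tangent space T_p
        if self.p != other.p:
            raise ValueError("tangent vectors at different points cannot be added")
        return TangentVector(tuple(a + b for a, b in zip(self.x, other.x)), self.p)

    def scale(self, c):
        # (cX)_p = c X_p keeps the same base point
        return TangentVector(tuple(c * a for a in self.x), self.p)

X = TangentVector((1.0, 2.0), (0.0, 0.0))
Y = TangentVector((3.0, -1.0), (0.0, 0.0))
print(X + Y)
print(X.scale(2.0))
```

Storing the base point alongside the components is one possible design; it makes the absence of a natural sum across different tangent spaces an explicit error rather than a silent mistake.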

The set $T(\mathbf{R}^n)$ (or simply $T\mathbf{R}^n$) consisting of the union
of all tangent spaces at all points in $\mathbf{R}^n$ is called the tangent
bundle. This object is not a vector space, but as we will see later it has much
more structure than just a set.


1.1.6 Definition A vector field $X$ in $U \subset \mathbf{R}^n$ is a section
of the tangent bundle, that is, a smooth function from $U$ to $T(U)$. The space
of sections $\Gamma(T(U))$ is also denoted by $\mathscr{X}(U)$.

The difference between a tangent vector and a vector field is that in the
latter case, the coefficients $v^i$ of $\mathbf{x}$ are smooth functions of
$x^i$. Since in general there are not enough dimensions to depict a tangent
bundle and vector fields as sections thereof, we use abstract diagrams such as
the one shown in Figure 1.1. In such a picture, the base space $M$ (in this
case $M = \mathbf{R}^n$) is compressed into the continuum at the bottom of the
picture in which several points $p_1, \dots, p_k$ are shown. To each such point
one attaches a tangent space. Here, the tangent spaces are just copies of
$\mathbf{R}^n$ shown as vertical “fibers” in the diagram. The vector component
$\mathbf{x}_p$ of a tangent vector at the point $p$ is depicted as an arrow
embedded in the fiber. The union of all such fibers constitutes the tangent
bundle $TM = T\mathbf{R}^n$. A section of the bundle amounts to assigning a
tangent vector to every point in the base. It is required that such assignment
of vectors is done in a smooth way so that there are no major “changes” of the
vector field between nearby points.

Fig. 1.1: Tangent Bundle $TM = T\mathbf{R}^n$
Given any two vector fields $X$ and $Y$ and any smooth function $f$, we can
define new vector fields $X + Y$ and $fX$ by

$$(X + Y)_p = X_p + Y_p, \qquad (fX)_p = f(p)\,X_p, \tag{1.2}$$

so that $\mathscr{X}(U)$ has the structure of a vector space over
$\mathbf{R}$. The subscript notation $X_p$ indicating the location of a tangent
vector is sometimes cumbersome, but necessary to distinguish tangent vectors
from vector fields.

Fig. 1.2: Vector Field

Vector fields are essential objects in physical applications. If we consider
the flow of a fluid in a region, the velocity vector field represents the speed
and direction of the flow of the fluid at each point. Other examples of vector
fields in classical physics are the electric, magnetic, and gravitational
fields. The vector field in figure 1.2 represents a magnetic field around an
electrical wire pointing out of the page.


1.1.7 Definition Let $X_p = \{\mathbf{x}, p\}$ be a tangent vector in an open
neighborhood $U$ of a point $p \in \mathbf{R}^n$ and let $f$ be a $C^\infty$
function in $U$. The directional derivative of $f$ at the point $p$, in the
direction of $\mathbf{x}$, is defined by

$$X_p(f) = \nabla f(p) \cdot \mathbf{x}, \tag{1.3}$$

where $\nabla f(p)$ is the gradient of the function $f$ at the point $p$. The
notation

$$X_p(f) \equiv \nabla_{X_p} f$$

is also commonly used. This notation emphasizes that, in differential geometry,
we may think of a tangent vector at a point as an operator on the space of
smooth functions in a neighborhood of the point. The operator assigns to a
function $f$ the directional derivative of that function in the direction of
the vector. Here we need not assume as in calculus that the direction vectors
have unit length.

It is easy to generalize the notion of directional derivatives to vector fields
by defining

$$X(f) \equiv \nabla_X f = \nabla f \cdot \mathbf{x}, \tag{1.4}$$

where the function $f$ and the components of $\mathbf{x}$ depend smoothly on
the points of $\mathbf{R}^n$.
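As a rough numerical illustration of the directional derivative $\nabla f(p) \cdot \mathbf{x}$, the following Python sketch approximates the gradient by central differences. The helper names `gradient` and `directional_derivative`, the step size `h`, and the sample function are assumptions made for this example only; they are not part of the text.

```python
def gradient(f, p, h=1e-6):
    """Central-difference approximation to the gradient of f at the point p."""
    grad = []
    for i in range(len(p)):
        forward = list(p)
        forward[i] += h
        backward = list(p)
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad

def directional_derivative(f, p, x, h=1e-6):
    """X_p(f) = grad f(p) . x for X_p = {x, p}; x need not have unit length."""
    return sum(g * v for g, v in zip(gradient(f, p, h), x))

# Hypothetical sample data: f(x, y, z) = x^2 y + z, p = (1, 2, 0), x = (3, 4, -3).
f = lambda q: q[0] ** 2 * q[1] + q[2]
p = (1.0, 2.0, 0.0)
x = (3.0, 4.0, -3.0)

# Analytically, grad f(p) = (2xy, x^2, 1) = (4, 1, 1), so X_p(f) = 12 + 4 - 3 = 13.
print(directional_derivative(f, p, x))
```

Note that the vector part is not normalized, in keeping with the remark that direction vectors need not have unit length.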

The tangent space at a point $p$ in $\mathbf{R}^n$ can be envisioned as another
copy of $\mathbf{R}^n$ superimposed at the point $p$. Thus, at a point $p$ in
$\mathbf{R}^2$, the tangent space consists of the point $p$ and a copy of the
vector space $\mathbf{R}^2$ attached as a “tangent plane” at the point $p$.
Since the base space is a flat 2-dimensional continuum, the tangent plane for
each point appears indistinguishable from the base space as in figure 1.2.

Later we will define the tangent space for a curved continuum such as a
surface in $\mathbf{R}^3$ as shown in figure 1.3. In this case, the tangent
space at a point $p$ consists of the vector space of all vectors actually
tangent to the surface at the given point.


Fig. 1.3: Tangent vectors $X_p$, $Y_p$ on a surface in $\mathbf{R}^3$.


1.1.8 Proposition If $f, g \in \mathscr{F}(\mathbf{R}^n)$,
$a, b \in \mathbf{R}$, and $X \in \mathscr{X}(\mathbf{R}^n)$ is a vector
field, then

$$X(af + bg) = aX(f) + bX(g), \qquad X(fg) = fX(g) + gX(f). \tag{1.5}$$
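The two identities of Proposition 1.1.8 can be spot-checked numerically before reading the proof. The sketch below is a sanity check, not a proof. It assumes a hypothetical helper `apply_field`, a sample field $X = -y\,\partial/\partial x + x\,\partial/\partial y$, and two arbitrary test functions, none of which come from the text.

```python
import math

def apply_field(v, f, p, h=1e-6):
    """Apply X = v^i d/dx^i to f at the point p, with central-difference partials."""
    total = 0.0
    for i, vi in enumerate(v(p)):
        fwd = list(p)
        fwd[i] += h
        bwd = list(p)
        bwd[i] -= h
        total += vi * (f(fwd) - f(bwd)) / (2 * h)
    return total

# Hypothetical sample data, chosen only for illustration.
v = lambda q: (-q[1], q[0])                  # components of X = -y d/dx + x d/dy
f = lambda q: math.sin(q[0]) * q[1]
g = lambda q: q[0] ** 2 + math.exp(q[1])
a, b = 2.0, -3.0
p = (0.5, 1.5)

# Linearity: X(af + bg) versus aX(f) + bX(g)
lin_gap = abs(apply_field(v, lambda q: a * f(q) + b * g(q), p)
              - (a * apply_field(v, f, p) + b * apply_field(v, g, p)))

# Leibniz rule: X(fg) versus fX(g) + gX(f)
leibniz_gap = abs(apply_field(v, lambda q: f(q) * g(q), p)
                  - (f(p) * apply_field(v, g, p) + g(p) * apply_field(v, f, p)))

print(lin_gap, leibniz_gap)   # both gaps should be at the level of rounding error
```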




1.1.9 Remark The space of smooth functions is a ring, ignoring a small
technicality with domains. An operator such as a vector field with the
properties above is called a linear derivation on $\mathscr{F}(\mathbf{R}^n)$.

Proof First, let us develop a mathematical expression for tangent vectors and
vector fields that will facilitate computation.

Let $p \in U$ be a point and let $x^i$ be the coordinate functions in $U$.
Suppose that $X_p = \{\mathbf{x}, p\}$, where the components of the Euclidean
vector $\mathbf{x}$ are $(v^1, \dots, v^n)$. Then, for any function $f$, the
tangent vector $X_p$ operates on $f$ according to the formula

$$X_p(f) = \sum_{i=1}^{n} v^i \left(\frac{\partial f}{\partial x^i}\right)(p). \tag{1.6}$$

It is therefore natural to identify the tangent vector $X_p$ with the
differential operator

$$X_p = \sum_{i=1}^{n} v^i \left(\frac{\partial}{\partial x^i}\right)_p
     = v^1 \left(\frac{\partial}{\partial x^1}\right)_p + \dots
     + v^n \left(\frac{\partial}{\partial x^n}\right)_p. \tag{1.7}$$

Notation: We will be using Einstein’s convention to suppress the summation
symbol whenever an expression contains a repeated index. Thus, for example,
the equation above could be simply written as

$$X_p = v^i \left(\frac{\partial}{\partial x^i}\right)_p. \tag{1.8}$$

This equation implies that the action of the vector $X_p$ on the coordinate
functions $x^i$ yields the components $v^i$ of the vector. In elementary
treatments, vectors are often identified with the components of the vector,
and this may cause some confusion.

The operators

$$\{e_1, \dots, e_n\}|_p = \left\{ \left(\frac{\partial}{\partial x^1}\right)_p, \dots,
\left(\frac{\partial}{\partial x^n}\right)_p \right\}$$

form a basis for the tangent space $T_p(\mathbf{R}^n)$ at the point $p$, and
any tangent vector can be written as a linear combination of these basis
vectors. The quantities $v^i$ are called the contravariant components of the
tangent vector. Thus, for example, the Euclidean vector in $\mathbf{R}^3$

$$\mathbf{x} = 3\mathbf{i} + 4\mathbf{j} - 3\mathbf{k}$$

located at a point $p$, would correspond to the tangent vector

$$X_p = 3\left(\frac{\partial}{\partial x}\right)_p + 4\left(\frac{\partial}{\partial y}\right)_p
- 3\left(\frac{\partial}{\partial z}\right)_p.$$


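Let $X = v^i \frac{\partial}{\partial x^i}$ be an arbitrary vector field and
let $f$ and $g$ be real-valued functions. Then

$$\begin{aligned}
X(af + bg) &= v^i \frac{\partial}{\partial x^i}(af + bg) \\
           &= v^i \frac{\partial}{\partial x^i}(af) + v^i \frac{\partial}{\partial x^i}(bg) \\
           &= a v^i \frac{\partial f}{\partial x^i} + b v^i \frac{\partial g}{\partial x^i} \\
           &= aX(f) + bX(g).
\end{aligned}$$

Similarly,

$$\begin{aligned}
X(fg) &= v^i \frac{\partial}{\partial x^i}(fg) \\
      &= v^i f \frac{\partial}{\partial x^i}(g) + v^i g \frac{\partial}{\partial x^i}(f) \\
      &= f v^i \frac{\partial g}{\partial x^i} + g v^i \frac{\partial f}{\partial x^i} \\
      &= fX(g) + gX(f).
\end{aligned}$$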


To re-emphasize, any quantity in Euclidean space which satisfies relations 1.5
is called a linear derivation on the space of smooth functions. The word linear
here is used in the usual sense of a linear operator in linear algebra, and the
word derivation means that the operator satisfies Leibnitz’ rule.

The proof of the following proposition is slightly beyond the scope of this 
course, but the proposition is important because it characterizes vector fields 
in a coordinate-independent manner. 


1.1.10 Proposition Any linear derivation on $\mathscr{F}(\mathbf{R}^n)$ is a
vector field.

This result allows us to identify vector fields with linear derivations. This
step is a big departure from the usual concept of a “calculus” vector. To a
differential geometer, a vector is a linear operator whose inputs are functions
and whose outputs are functions that at each point represent the directional
derivative in the direction of the Euclidean vector.


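1.1.11 Example Given the point $p(1, 1)$, the Euclidean vector
$\mathbf{x} = (3, 4)$, and the function $f(x, y) = x^2 + y^2$, we associate
$\mathbf{x}$ with the tangent vector

$$X_p = 3\left(\frac{\partial}{\partial x}\right)_p + 4\left(\frac{\partial}{\partial y}\right)_p.$$

Then,

$$X_p(f) = 3\left(\frac{\partial f}{\partial x}\right)_p + 4\left(\frac{\partial f}{\partial y}\right)_p
= 3(2x)|_p + 4(2y)|_p = 3(2) + 4(2) = 14.$$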


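1.1.12 Example Let $f(x, y, z) = xy^2z^3$ and $\mathbf{x} = (3x, 2y, z)$. Then

$$\begin{aligned}
X(f) &= 3x \frac{\partial f}{\partial x} + 2y \frac{\partial f}{\partial y} + z \frac{\partial f}{\partial z} \\
     &= 3x(y^2z^3) + 2y(2xyz^3) + z(3xy^2z^2) \\
     &= 3xy^2z^3 + 4xy^2z^3 + 3xy^2z^3 = 10xy^2z^3.
\end{aligned}$$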
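The data of Examples 1.1.11 and 1.1.12 can be fed to a small numerical check. The sketch below is illustrative only: the helper `apply_field`, the finite-difference step, and the sample point used for the second example are assumptions, not part of the text.

```python
def apply_field(v, f, p, h=1e-5):
    """Apply X = v^i d/dx^i to f at the point p, with central-difference partials."""
    total = 0.0
    for i, vi in enumerate(v(p)):
        fwd = list(p)
        fwd[i] += h
        bwd = list(p)
        bwd[i] -= h
        total += vi * (f(fwd) - f(bwd)) / (2 * h)
    return total

# Example 1.1.11: f = x^2 + y^2, x = (3, 4), p = (1, 1).
f1 = lambda q: q[0] ** 2 + q[1] ** 2
x1 = lambda q: (3.0, 4.0)
value_11 = apply_field(x1, f1, (1.0, 1.0))     # analytic value: 3(2) + 4(2) = 14

# Example 1.1.12: f = x y^2 z^3, x = (3x, 2y, z), at an arbitrary sample point.
f2 = lambda q: q[0] * q[1] ** 2 * q[2] ** 3
x2 = lambda q: (3 * q[0], 2 * q[1], q[2])
sample = (1.2, -0.7, 0.9)                      # hypothetical evaluation point
value_12 = apply_field(x2, f2, sample)
closed_form_12 = 10 * f2(sample)               # X(f) = 10 x y^2 z^3

print(value_11, value_12, closed_form_12)
```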


1.1.13 Definition Let $X$ be a vector field in $\mathbf{R}^n$ and $p$ be a
point. A curve $\alpha(t)$ with $\alpha(0) = p$ is called an integral curve of
$X$ if $\alpha'(0) = X_p$, and, whenever $\alpha(t)$ is in the domain of the
vector field, $\alpha'(t) = X_{\alpha(t)}$.

In elementary calculus and differential equations, the families of integral
curves of a vector field are called the streamlines, suggesting the
trajectories of a fluid with velocity vector $X$. In figure 1.2, the integral
curves would be circles that fit neatly along the flow of the vector field. In
local coordinates, the expression defining integral curves of $X$ constitutes a
system of first order differential equations, so the existence and uniqueness
theorems for solutions apply locally. We will treat this in more detail in
subsection 7.1.1.
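Since integral curves solve a first order system $\alpha'(t) = X_{\alpha(t)}$, they can be approximated with any standard ODE scheme. The sketch below uses the classical fourth-order Runge-Kutta method. The function name `rk4_integral_curve`, the step count, and the rotational sample field $X = -y\,\partial/\partial x + x\,\partial/\partial y$ are assumptions for illustration, chosen because the exact integral curves of that field are circles.

```python
import math

def rk4_integral_curve(X, p, t_end, steps=1000):
    """Approximate alpha(t_end) for the integral curve alpha' = X(alpha), alpha(0) = p."""
    dt = t_end / steps
    point = list(p)
    for _ in range(steps):
        k1 = X(point)
        k2 = X([x + 0.5 * dt * k for x, k in zip(point, k1)])
        k3 = X([x + 0.5 * dt * k for x, k in zip(point, k2)])
        k4 = X([x + dt * k for x, k in zip(point, k3)])
        point = [x + dt * (a + 2 * b + 2 * c + d) / 6
                 for x, a, b, c, d in zip(point, k1, k2, k3, k4)]
    return point

# Hypothetical rotational field; its integral curve through (1, 0) is (cos t, sin t).
X = lambda q: (-q[1], q[0])
end_point = rk4_integral_curve(X, (1.0, 0.0), math.pi / 2)
print(end_point)   # should be close to (cos(pi/2), sin(pi/2)) = (0, 1)
```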


1.2 Differentiable Maps 


1.2.1 Definition Let $F : \mathbf{R}^n \to \mathbf{R}^m$ be a vector function
defined by coordinate entries $F(p) = (f^1(p), f^2(p), \dots, f^m(p))$. The
vector function is called a mapping if the coordinate functions are all
differentiable. If the coordinate functions are $C^\infty$, $F$ is called a
smooth mapping. If $(x^1, x^2, \dots, x^n)$ are local coordinates in
$\mathbf{R}^n$ and $(y^1, y^2, \dots, y^m)$ local coordinates in
$\mathbf{R}^m$, a map $y = F(x)$ is represented in advanced calculus by $m$
functions of $n$ variables

$$y^j = f^j(x^i), \qquad i = 1, \dots, n, \quad j = 1, \dots, m. \tag{1.9}$$

A map $F : \mathbf{R}^n \to \mathbf{R}^m$ is differentiable at a point
$p \in \mathbf{R}^n$ if there exists a linear transformation
$DF(p) : \mathbf{R}^n \to \mathbf{R}^m$ such that

$$\lim_{h \to 0} \frac{|F(p + h) - F(p) - DF(p)(h)|}{|h|} = 0. \tag{1.10}$$

The linear transformation $DF(p)$ is called the Jacobian. A differentiable map
that is invertible and whose inverse is differentiable is called a
diffeomorphism.
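The limit in equation 1.10 can be watched numerically. The sketch below uses the polar-to-Cartesian map as a sample $F$, with its Jacobian written out by hand; the function names `F`, `DF`, `remainder_ratio` and the choice of increments are assumptions made for this illustration only.

```python
import math

def F(p):
    """Sample map F(r, theta) = (r cos theta, r sin theta)."""
    r, theta = p
    return (r * math.cos(theta), r * math.sin(theta))

def DF(p):
    """Jacobian matrix of F at p, rows indexed by the target coordinates."""
    r, theta = p
    return ((math.cos(theta), -r * math.sin(theta)),
            (math.sin(theta), r * math.cos(theta)))

def remainder_ratio(p, h):
    """The quotient |F(p+h) - F(p) - DF(p)(h)| / |h| appearing in equation 1.10."""
    Fp = F(p)
    Fph = F((p[0] + h[0], p[1] + h[1]))
    J = DF(p)
    lin = (J[0][0] * h[0] + J[0][1] * h[1], J[1][0] * h[0] + J[1][1] * h[1])
    err = math.hypot(Fph[0] - Fp[0] - lin[0], Fph[1] - Fp[1] - lin[1])
    return err / math.hypot(h[0], h[1])

p = (2.0, math.pi / 6)
ratios = [remainder_ratio(p, (s, -s)) for s in (1e-1, 1e-2, 1e-3)]
print(ratios)   # the quotient shrinks as the increment h shrinks
```

For a smooth map the remainder is second order in $h$, so the quotient is expected to decrease roughly in proportion to $|h|$.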


Remarks 


1. A differentiable mapping $F : I \subset \mathbf{R} \to \mathbf{R}^n$ is what
we called a curve. If $t \in I = [a, b]$, the mapping gives a parametrization
$\mathbf{x}(t)$, as we discussed in the previous section.


2. A differentiable mapping $F : R \subset \mathbf{R}^n \to \mathbf{R}^n$ is
called a coordinate transformation. Thus, for example, the mapping
$F : (u, v) \in \mathbf{R}^2 \to (x, y) \in \mathbf{R}^2$, given by functions
$x = x(u, v)$, $y = y(u, v)$, would constitute a change of coordinates from
$(u, v)$ to $(x, y)$. The most familiar case is the polar coordinates
transformation $x = r\cos\theta$, $y = r\sin\theta$.


3. A differentiable mapping $F : R \subset \mathbf{R}^2 \to \mathbf{R}^3$ is
what in calculus we called a parametric surface. Typically, one assumes that
$R$ is a simple closed region, such as a rectangle. If one denotes the
coordinates in $\mathbf{R}^2$ by $(u, v) \in R$, and
$\mathbf{x} \in \mathbf{R}^3$, the parametrization is written as
$\mathbf{x}(u, v) = (x(u, v), y(u, v), z(u, v))$. The treatment of surfaces in
$\mathbf{R}^3$ is presented in chapter 4. If $\mathbf{R}^3$ is replaced by
$\mathbf{R}^n$, the mapping locally represents a 2-dimensional surface in a
space of $n$ dimensions.


For each point $p \in \mathbf{R}^n$, we say that the Jacobian induces a linear
transformation $F_*$ from the tangent space $T_p\mathbf{R}^n$ to the tangent
space $T_{F(p)}\mathbf{R}^m$. In differential geometry this Jacobian map is
also called the push-forward. If we let $X$ be a tangent vector in
$\mathbf{R}^n$, then the tangent vector $F_*X$ in $\mathbf{R}^m$ is defined by

$$F_*X(f) = X(f \circ F), \tag{1.11}$$

where $f \in \mathscr{F}(\mathbf{R}^m)$. (See figure 1.4)


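Fig. 1.4: Jacobian Map. (Diagram: $F_*$ carries $X \in T_p\mathbf{R}^n$ to
$F_*X \in T_{F(p)}\mathbf{R}^m$, above the maps
$\mathbf{R}^n \xrightarrow{F} \mathbf{R}^m \xrightarrow{f} \mathbf{R}$.)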


As shown in the diagram, $F_*X(f)$ is evaluated at $F(p)$ whereas $X$ is
evaluated at $p$. So, to be precise, equation 1.11 should really be written as

$$F_*X(f)(F(p)) = X(f \circ F)(p), \tag{1.12}$$
$$F_*X(f) \circ F = X(f \circ F). \tag{1.13}$$

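As we have learned from linear algebra, to find a matrix representation of a
linear map in a particular basis, one applies the map to the basis vectors. If
we denote by $\{\frac{\partial}{\partial x^i}\}$ the basis for the tangent
space at a point $p \in \mathbf{R}^n$ and by
$\{\frac{\partial}{\partial y^j}\}$ the basis for the tangent space at the
corresponding point $F(p) \in \mathbf{R}^m$ with coordinates given by
$y^j = f^j(x^i)$, the push-forward definition reads

$$F_*\left(\frac{\partial}{\partial x^i}\right)(f)
= \frac{\partial}{\partial x^i}(f \circ F)
= \frac{\partial f}{\partial y^j}\frac{\partial y^j}{\partial x^i},$$

$$F_*\left(\frac{\partial}{\partial x^i}\right)
= \frac{\partial y^j}{\partial x^i}\frac{\partial}{\partial y^j}.$$

In other words, the matrix representation of $F_*$ in the standard basis is in
fact the Jacobian matrix. In classical notation, we simply write the Jacobian
map in the familiar form

$$\frac{\partial}{\partial x^i} = \frac{\partial y^j}{\partial x^i}\frac{\partial}{\partial y^j}. \tag{1.14}$$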
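The statement that the push-forward is represented by the Jacobian matrix can be tested on a concrete map. In the sketch below, the two sides of $F_*X(f) = X(f \circ F)$ are computed independently: the left side multiplies the components of $X$ by the Jacobian and differentiates $f$ at $F(p)$, while the right side differentiates the composite at $p$. The sample map (polar coordinates), the test function, and all helper names are assumptions for illustration.

```python
import math

def F(p):
    """Sample map F(r, theta) = (r cos theta, r sin theta)."""
    r, theta = p
    return (r * math.cos(theta), r * math.sin(theta))

def jacobian_F(p):
    """Hand-computed matrix of partials d y^j / d x^i for the sample map."""
    r, theta = p
    return ((math.cos(theta), -r * math.sin(theta)),
            (math.sin(theta), r * math.cos(theta)))

def partials(f, p, h=1e-6):
    """Central-difference partial derivatives of a scalar function f at p."""
    out = []
    for i in range(len(p)):
        fwd = list(p)
        fwd[i] += h
        bwd = list(p)
        bwd[i] -= h
        out.append((f(fwd) - f(bwd)) / (2 * h))
    return out

p = (2.0, math.pi / 3)                      # base point in the (r, theta) plane
v = (0.5, -1.0)                             # hypothetical components of X at p
f = lambda q: q[0] ** 2 * q[1] + q[1]       # hypothetical test function on the target

# Right-hand side of (1.11): X applied to the composite f o F at p
rhs = sum(vi * d for vi, d in zip(v, partials(lambda q: f(F(q)), p)))

# Left-hand side: F_* X has components J v and acts on f at F(p)
J = jacobian_F(p)
w = tuple(sum(J[j][i] * v[i] for i in range(2)) for j in range(2))
lhs = sum(wj * d for wj, d in zip(w, partials(f, F(p))))

print(lhs, rhs)   # the two values should agree up to discretization error
```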


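1.2.2 Theorem If $F : \mathbf{R}^n \to \mathbf{R}^m$ and
$G : \mathbf{R}^m \to \mathbf{R}^p$ are mappings, then
$(G \circ F)_* = G_* \circ F_*$.
Proof Let $X \in T_p(\mathbf{R}^n)$, and let $f$ be a smooth function
$f : \mathbf{R}^p \to \mathbf{R}$. Then,

$$\begin{aligned}
(G \circ F)_*(X)(f) &= X(f \circ (G \circ F)) \\
                    &= X((f \circ G) \circ F) \\
                    &= F_*(X)(f \circ G) \\
                    &= G_*(F_*(X))(f) \\
                    &= (G_* \circ F_*)(X)(f).
\end{aligned}$$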


1.2.3 Inverse Function Theorem. When $m = n$, mappings are called
changes of coordinates. In the terminology of tangent spaces, the classical
inverse function theorem states that if the Jacobian map $F_*$ is a vector
space isomorphism at a point, then there exists a neighborhood of the point in
which $F$ is a diffeomorphism.


1.2.4 Remarks 


1. Equation 1.14 shows that under change of coordinates, basis tangent
vectors and by linearity all tangent vectors transform by multiplication by
the matrix representation of the Jacobian. This is the source of the almost
tautological definition in physics, that a contravariant tensor of rank one,
is one that transforms like a contravariant tensor of rank one.


2. Many authors use the notation $dF$ to denote the push-forward map $F_*$.


3. If $F : \mathbf{R}^n \to \mathbf{R}^m$ and
$G : \mathbf{R}^m \to \mathbf{R}^p$ are mappings, we leave it as an
exercise for the reader to verify that the formula
$(G \circ F)_* = G_* \circ F_*$ for the composition of linear transformations
corresponds to the classical chain rule.


4. As we will see later, the concept of the push-forward extends to manifold
mappings $F : M \to N$.
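The correspondence between $(G \circ F)_* = G_* \circ F_*$ and the chain rule can be checked numerically by comparing the Jacobian matrix of a composite with the product of the individual Jacobian matrices. This is a sketch under assumptions, not a solution to the exercise: the maps `F` and `G`, the base point, and the helpers `numerical_jacobian` and `matmul` are all hypothetical.

```python
import math

def numerical_jacobian(F, p, h=1e-6):
    """Matrix of partials d y^j / d x^i (rows indexed by j), by central differences."""
    m = len(F(p))
    columns = []
    for i in range(len(p)):
        fwd = list(p)
        fwd[i] += h
        bwd = list(p)
        bwd[i] -= h
        Ff, Fb = F(fwd), F(bwd)
        columns.append([(Ff[j] - Fb[j]) / (2 * h) for j in range(m)])
    return [[columns[i][j] for i in range(len(p))] for j in range(m)]

def matmul(A, B):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B))) for c in range(len(B[0]))]
            for r in range(len(A))]

# Hypothetical sample maps F: R^2 -> R^3 and G: R^3 -> R^2.
F = lambda p: (p[0] * p[1], math.sin(p[0]), p[1] ** 2)
G = lambda q: (q[0] + q[1] * q[2], math.exp(q[0]) - q[2])
p = (0.4, 1.1)

J_composite = numerical_jacobian(lambda x: G(F(x)), p)                   # matrix of (G o F)_*
J_product = matmul(numerical_jacobian(G, F(p)), numerical_jacobian(F, p))  # G_* o F_*

max_gap = max(abs(J_composite[r][c] - J_product[r][c])
              for r in range(2) for c in range(2))
print(max_gap)   # expected to be at the level of discretization error
```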




1.3 Curves in $\mathbf{R}^3$


1.3.1 Parametric Curves 


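1.3.1 Definition A curve $\alpha(t)$ in $\mathbf{R}^3$ is a $C^\infty$ map
from an interval $I \subset \mathbf{R}$ into $\mathbf{R}^3$. The curve assigns
to each value of a parameter $t \in \mathbf{R}$, a point
$(\alpha^1(t), \alpha^2(t), \alpha^3(t)) \in \mathbf{R}^3$.

$$I \subset \mathbf{R} \;\xrightarrow{\;\alpha\;}\; \mathbf{R}^3, \qquad
t \longmapsto \alpha(t) = (\alpha^1(t), \alpha^2(t), \alpha^3(t)).$$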


One may think of the parameter $t$ as representing time, and the curve $\alpha$
as representing the trajectory of a moving point particle as a function of
time. When convenient, we also use classical notation for the position vector

$$\mathbf{x}(t) = (x^1(t), x^2(t), x^3(t)), \tag{1.15}$$

which is more prevalent in vector calculus and elementary physics textbooks.
Of course, what this notation really means is

$$x^i(t) = (u^i \circ \alpha)(t), \tag{1.16}$$

where $u^i$ are the coordinate slot functions in an open set in $\mathbf{R}^3$.


1.3.2 Example Let

$$\alpha(t) = (a_1 t + b_1, \; a_2 t + b_2, \; a_3 t + b_3). \tag{1.17}$$

This equation represents a straight line passing through the point
$p = (b_1, b_2, b_3)$, in the direction of the vector
$\mathbf{v} = (a_1, a_2, a_3)$.


1.3.3 Example Let

$$\alpha(t) = (a\cos\omega t, \; a\sin\omega t, \; bt). \tag{1.18}$$

This curve is called a circular helix. Geometrically, we may view the curve as
the path described by the hypotenuse of a triangle with slope $b$, which is
wrapped around a circular cylinder of radius $a$. The projection of the helix
onto the $xy$-plane is a circle and the curve rises at a constant rate in the
$z$-direction (See Figure 1.5a). Similarly, the curve
$\alpha(t) = (a\cosh\omega t, \; a\sinh\omega t, \; bt)$ is called a hyperbolic
“helix.” It represents the graph of a curve that wraps around a hyperbolic
cylinder rising at a constant rate.


1.3.4 Example Let

$$\alpha(t) = (a(1 + \cos t), \; a\sin t, \; 2a\sin(t/2)). \tag{1.19}$$

This curve is called the Temple of Viviani. Geometrically, this is the curve
of intersection of a sphere $x^2 + y^2 + z^2 = 4a^2$ of radius $2a$, and the
cylinder $x^2 + y^2 = 2ax$ of radius $a$, with a generator tangent to the
diameter of the sphere along the $z$-axis (See Figure 1.5b).

Fig. 1.5: a) Circular Helix. b) Temple of Viviani
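The claim that the Temple of Viviani lies on both the sphere and the cylinder follows from the identities $x^2 + y^2 = 2a^2(1 + \cos t)$ and $z^2 = 2a^2(1 - \cos t)$. As a quick numerical cross-check, the sketch below evaluates both defining equations along the curve; the function names and the sampled parameter values are assumptions for illustration.

```python
import math

def viviani(t, a=1.0):
    """Point on the curve alpha(t) = (a(1 + cos t), a sin t, 2a sin(t/2))."""
    return (a * (1 + math.cos(t)), a * math.sin(t), 2 * a * math.sin(t / 2))

def residuals(t, a=1.0):
    """Left side minus right side for the sphere and cylinder equations at alpha(t)."""
    x, y, z = viviani(t, a)
    sphere = x ** 2 + y ** 2 + z ** 2 - 4 * a ** 2    # x^2 + y^2 + z^2 = 4a^2
    cylinder = x ** 2 + y ** 2 - 2 * a * x            # x^2 + y^2 = 2ax
    return sphere, cylinder

# Sample the parameter over two full turns, since sin(t/2) has period 4*pi.
samples = [k * 4 * math.pi / 50 for k in range(51)]
worst = max(max(abs(s), abs(c)) for s, c in (residuals(t, a=1.5) for t in samples))
print(worst)   # expected to be at the level of floating-point rounding
```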


The Temple of Viviani is of historical interest in the development of calculus.
The problem was posed anonymously by Viviani to Leibnitz, to determine on
the surface of a hemisphere, four identical windows, in such a way that the
remaining surface be equivalent to a square. It appears as if Viviani was
challenging the effectiveness of the new methods of calculus against the power
of traditional geometry.

It is said that Leibnitz understood the nature of the challenge and solved the
problem in one day. Not knowing the proposer of the enigma, he sent the
solution to his Serenity Ferdinando, as he guessed that the challenge must have
originated from prominent Italian mathematicians. Upon receipt of the solution
by Leibnitz, Viviani posted a mechanical solution without proof. He described
it as using a boring device to remove from a hemisphere the surface area cut by
two cylinders with half the radius, and which are tangential to a diameter of
the base. Upon realizing this could not physically be rendered as a temple
since the roof surface would rest on only four points, Viviani no longer spoke
of a temple but referred to the shape as a “sail.”


1.3.5 Definition Let $\alpha : I \to \mathbf{R}^3$ be a curve in
$\mathbf{R}^3$ given in components as above,
$\alpha = (\alpha^1, \alpha^2, \alpha^3)$. For each point $t \in I$ we define
the velocity or tangent vector of the curve by

$$\alpha'(t) = \left(\frac{d\alpha^1}{dt}, \frac{d\alpha^2}{dt}, \frac{d\alpha^3}{dt}\right). \tag{1.20}$$

At each point of the curve, the velocity vector is tangent to the curve and
thus the velocity constitutes a vector field representing the velocity flow
along that curve. In a similar manner the second derivative $\alpha''(t)$ is a
vector field called the acceleration along the curve. The length
$v = \|\alpha'(t)\|$ of the velocity vector is called the speed of the curve.
The classical components of the velocity vector are simply given by

$$\dot{\mathbf{x}} = \frac{d\mathbf{x}}{dt}
= \left(\frac{dx^1}{dt}, \frac{dx^2}{dt}, \frac{dx^3}{dt}\right), \tag{1.21}$$

and the speed is

$$v = \sqrt{\left(\frac{dx^1}{dt}\right)^2 + \left(\frac{dx^2}{dt}\right)^2
+ \left(\frac{dx^3}{dt}\right)^2}. \tag{1.22}$$

The notation $T(t)$ or $T_\alpha(t)$ is also used for the tangent vector
$\alpha'(t)$, but for now, we reserve $T(t)$ for the unit tangent vector to be
introduced in section 1.3.3 on Frenet frames.

As is well known, the vector form of the equation of the line 1.17 can be
written as $\mathbf{x}(t) = \mathbf{p} + t\mathbf{v}$, which is consistent with
the Euclidean axiom stating that given a point and a direction, there is only
one line passing through that point in that direction. In this case, the
velocity $\dot{\mathbf{x}} = \mathbf{v}$ is constant and hence the acceleration
$\ddot{\mathbf{x}} = 0$. This is as one would expect from Newton’s law of
inertia.

The differential $d\mathbf{x}$ of the position vector given by

$$d\mathbf{x} = (dx^1, dx^2, dx^3)
= \left(\frac{dx^1}{dt}, \frac{dx^2}{dt}, \frac{dx^3}{dt}\right) dt, \tag{1.23}$$

which appears in line integrals in advanced calculus is some sort of an
infinitesimal tangent vector. The norm $\|d\mathbf{x}\|$ of this infinitesimal
tangent vector is called the differential of arc length $ds$. Clearly, we have

$$ds = \|d\mathbf{x}\| = v\,dt. \tag{1.24}$$


If one identifies the parameter t as time in some given units, what this says 
is that for a particle moving along a curve, the speed is the rate of change of 
the arc length with respect to time. This is intuitively exactly what one would 
expect. 
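The relation $ds = v\,dt$ says that arc length is the time integral of the speed. The sketch below applies this to a circular helix, for which the speed $\sqrt{a^2\omega^2 + b^2}$ is constant, so the numerical integral can be compared with a closed form. The parameter values, the helper names `speed` and `arc_length`, and the choice of the trapezoid rule are assumptions for illustration.

```python
import math

def helix(t, a=2.0, omega=3.0, b=0.5):
    """Circular helix alpha(t) = (a cos(omega t), a sin(omega t), b t)."""
    return (a * math.cos(omega * t), a * math.sin(omega * t), b * t)

def speed(curve, t, h=1e-5):
    """v = ||alpha'(t)||, with the velocity approximated by central differences."""
    ahead, behind = curve(t + h), curve(t - h)
    return math.sqrt(sum(((u - w) / (2 * h)) ** 2 for u, w in zip(ahead, behind)))

def arc_length(curve, t0, t1, n=2000):
    """Integrate ds = v dt over [t0, t1] with the composite trapezoid rule."""
    dt = (t1 - t0) / n
    total = 0.5 * (speed(curve, t0) + speed(curve, t1))
    total += sum(speed(curve, t0 + k * dt) for k in range(1, n))
    return total * dt

length = arc_length(helix, 0.0, 2.0)
# For the helix the speed is the constant sqrt(a^2 omega^2 + b^2), so s = v * (t1 - t0).
exact = 2.0 * math.sqrt(2.0 ** 2 * 3.0 ** 2 + 0.5 ** 2)
print(length, exact)
```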

The notion of infinitesimal objects needs to be treated in a more rigorous 
mathematical setting. At the same time, we must not discard the great intuitive 
value of this notion as envisioned by the masters who invented calculus, even 
at the risk of some possible confusion! Thus, whereas in the more strict sense 
of modern differential geometry, the velocity is a tangent vector and hence it 
is a differential operator on the space of functions, the quantity dx can be 
viewed as a traditional vector which, at the infinitesimal level, represents a 
linear approximation to the curve and points tangentially in the direction of v. 




1.3.2 Velocity 


For any smooth function f : R³ → R, we formally define the action of the
velocity vector field α'(t) as a linear derivation by the formula

\[ \alpha'(t)(f)\big|_{\alpha(t)} = \frac{d}{dt}(f \circ \alpha)\Big|_{t}. \tag{1.25} \]


The modern notation is more precise, since it takes into account that the velocity
has a vector part as well as a point of application. Given a point on the curve,
the velocity of the curve acting on a function yields the directional derivative
of that function in the direction tangential to the curve at the point in question.
The diagram in figure 1.6 below provides a more visual interpretation of the 
velocity vector formula 1.25, as a linear mapping between tangent spaces. 


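\[ \frac{d}{dt} \in T_t\mathbf{R} \;\xrightarrow{\ \alpha_*\ }\; T_{\alpha(t)}\mathbf{R}^3 \ni \alpha'(t) \]

\[ \mathbf{R} \;\xrightarrow{\ \alpha\ }\; \mathbf{R}^3 \;\xrightarrow{\ f\ }\; \mathbf{R} \]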


Fig. 1.6: Velocity Vector Operator 


The map α(t) from R to R³ induces a push-forward map α_* from the
tangent space of R to the tangent space of R³. The image α_*(d/dt) in TR³ of
the tangent vector d/dt is what we call α'(t):

\[ \alpha_*\!\left(\frac{d}{dt}\right) = \alpha'(t). \]

Since α'(t) is a tangent vector in R³, it acts on functions in R³. The action of
α'(t) on a function f on R³ is the same as the action of d/dt on the composition
(f ∘ α). In particular, if we apply α'(t) to the coordinate functions x^i, we get
the components of the tangent vector

\[ \alpha'(t)(x^i)\big|_{\alpha(t)} = \frac{d}{dt}(x^i \circ \alpha)\Big|_{t}. \tag{1.26} \]


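To unpack the above discussion in the simplest possible terms, we associate
with the classical velocity vector v = ẋ a linear derivation α'(t) given by

\[ \begin{aligned}
\alpha'(t) &= \frac{d}{dt}(x^i \circ \alpha)\Big|_{t} \left(\frac{\partial}{\partial x^i}\right)_{\alpha(t)} \\
&= \frac{dx^1}{dt}\frac{\partial}{\partial x^1} + \frac{dx^2}{dt}\frac{\partial}{\partial x^2} + \frac{dx^3}{dt}\frac{\partial}{\partial x^3}.
\end{aligned} \tag{1.27} \]

So, given a real valued function f in R³, the action of the velocity vector is
given by the chain rule

\[ \begin{aligned}
\alpha'(f) &= \frac{\partial f}{\partial x^1}\frac{dx^1}{dt} + \frac{\partial f}{\partial x^2}\frac{dx^2}{dt} + \frac{\partial f}{\partial x^3}\frac{dx^3}{dt} \\
&= \nabla f \cdot \mathbf{v}.
\end{aligned} \]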
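The identity α'(f) = ∇f · v can be checked numerically. The following is a minimal sketch, not part of the original text: the sample helix, the test function f, and all function names are assumptions chosen only for illustration. It compares a finite-difference value of d/dt (f ∘ α) with the dot product ∇f · v.

```python
import math

def alpha(t):
    # Sample curve (assumption): a helix with a = 1, omega = 1, b = 0.5.
    return (math.cos(t), math.sin(t), 0.5 * t)

def velocity(t):
    # Exact derivative alpha'(t) of the sample curve.
    return (-math.sin(t), math.cos(t), 0.5)

def f(x, y, z):
    # Hypothetical smooth test function on R^3.
    return x * y + z ** 2

def grad_f(x, y, z):
    # Gradient of the test function, computed by hand.
    return (y, x, 2.0 * z)

def derivation(t, h=1e-5):
    # alpha'(t)(f) as d/dt (f o alpha), approximated by a central difference.
    return (f(*alpha(t + h)) - f(*alpha(t - h))) / (2.0 * h)

def grad_dot_v(t):
    # Right-hand side of the chain rule: grad f . v at the point alpha(t).
    g = grad_f(*alpha(t))
    v = velocity(t)
    return sum(gi * vi for gi, vi in zip(g, v))

t0 = 0.7
print(derivation(t0), grad_dot_v(t0))
```

The two printed numbers should agree up to the finite-difference error, which is what the chain rule predicts.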


If α(t) is a curve in Rⁿ with tangent vector X = α'(t), and F : Rⁿ → Rᵐ
is a differentiable map, then F_*X is a tangent vector to the curve F ∘ α in Rᵐ.
That is, F_* maps tangent vectors of α to tangent vectors of F ∘ α.


1.3.6 Definition If t = t(s) is a smooth, real valued function and α(t) is a
curve in R³, we say that the curve β(s) = α(t(s)) is a reparametrization of α.

A common reparametrization of a curve is obtained by using the arc length
as the parameter. Using this reparametrization is quite natural, since we know
from basic physics that the rate of change of the arc length is what we call
speed

\[ v = \frac{ds}{dt} = \|\alpha'(t)\|. \tag{1.28} \]

The arc length is obtained by integrating the above formula

\[ s = \int \|\alpha'(t)\|\,dt = \int \sqrt{ \left(\frac{dx^1}{dt}\right)^2 + \left(\frac{dx^2}{dt}\right)^2 + \left(\frac{dx^3}{dt}\right)^2 }\;dt. \tag{1.29} \]
In practice, it is typically difficult to find an explicit arc length parametrization
of a curve since not only does one have to calculate the integral, but also one
needs to be able to find the inverse function t in terms of s. On the other hand,
from a theoretical point of view, arc length parametrizations are ideal, since
any curve so parametrized has unit speed. The proof of this fact is a simple
application of the chain rule and the inverse function theorem.

\[ \begin{aligned}
\beta'(s) &= [\alpha(t(s))]' \\
&= \alpha'(t(s))\,t'(s) \\
&= \alpha'(t(s))\,\frac{1}{s'(t(s))} \\
&= \frac{\alpha'(t(s))}{\|\alpha'(t(s))\|},
\end{aligned} \]

and any vector divided by its length is a unit vector. Leibniz notation makes
this even more self-evident

\[ \frac{d\mathbf{x}}{ds} = \frac{d\mathbf{x}}{dt}\frac{dt}{ds} = \frac{d\mathbf{x}/dt}{ds/dt} = \frac{d\mathbf{x}/dt}{\left\| d\mathbf{x}/dt \right\|}. \]


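1.3.7 Example Let α(t) = (a cos ωt, a sin ωt, bt). Then

\[ \mathbf{v}(t) = (-a\omega \sin \omega t,\; a\omega \cos \omega t,\; b), \]

\[ \begin{aligned}
s(t) &= \int_0^t \sqrt{(-a\omega \sin \omega u)^2 + (a\omega \cos \omega u)^2 + b^2}\;du \\
&= \int_0^t \sqrt{a^2\omega^2 + b^2}\;du \\
&= ct, \quad \text{where } c = \sqrt{a^2\omega^2 + b^2}.
\end{aligned} \]

The helix of unit speed is then given by

\[ \beta(s) = \left( a \cos\frac{\omega s}{c},\; a \sin\frac{\omega s}{c},\; \frac{bs}{c} \right). \]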
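As a sanity check of this example, the sketch below (an illustration only; the parameter values a = 2, ω = 3, b = 1 and the helper names are assumptions) integrates the speed with the trapezoidal rule to recover s(t) = ct, and then estimates the speed of the reparametrized helix β(s) by a central difference.

```python
import math

a, omega, b = 2.0, 3.0, 1.0                 # arbitrary sample parameters
c = math.sqrt(a**2 * omega**2 + b**2)       # c = sqrt(a^2 w^2 + b^2)

def speed(t):
    # ||alpha'(t)|| for alpha(t) = (a cos wt, a sin wt, bt).
    vx = -a * omega * math.sin(omega * t)
    vy = a * omega * math.cos(omega * t)
    return math.sqrt(vx * vx + vy * vy + b * b)

def arc_length(t, n=1000):
    # Trapezoidal rule for s(t) = integral of the speed from 0 to t.
    h = t / n
    total = 0.5 * (speed(0.0) + speed(t))
    total += sum(speed(i * h) for i in range(1, n))
    return h * total

def beta(s):
    # Unit speed reparametrization of the helix.
    return (a * math.cos(omega * s / c), a * math.sin(omega * s / c), b * s / c)

def beta_speed(s, h=1e-5):
    # Central-difference estimate of ||beta'(s)||.
    p, q = beta(s - h), beta(s + h)
    return math.sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q))) / (2.0 * h)

print(arc_length(2.0), c * 2.0)   # numerical arc length versus ct
print(beta_speed(1.3))            # should be close to 1
```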


1.3.3 Frenet Frames 


Let β(s) be a curve parametrized by arc length and let T(s) be the vector

\[ T(s) = \beta'(s). \tag{1.30} \]

The vector T(s) is tangential to the curve and it has unit length. Hereafter, we
will call T the unit tangent vector. Differentiating the relation

\[ T \cdot T = 1, \tag{1.31} \]

we get

\[ 2\,T \cdot T' = 0, \tag{1.32} \]

so we conclude that the vector T' is orthogonal to T. Let N be a unit vector
orthogonal to T, and let κ be the scalar such that

\[ T'(s) = \kappa N(s). \tag{1.33} \]

We call N the unit normal to the curve, and κ the curvature. Taking the length
of both sides of the last equation, and recalling that N has unit length, we deduce
that

\[ \kappa = \|T'(s)\|. \tag{1.34} \]


It makes sense to call κ the curvature because, if T is a unit vector, then T'(s)
is not zero only if the direction of T is changing. The rate of change of the
direction of the tangent vector is precisely what one would expect to measure
how much a curve is curving. We now introduce a third vector

\[ B = T \times N, \tag{1.35} \]

which we will call the binormal vector. The triplet of vectors (T, N, B) forms
an orthonormal set; that is,

\[ \begin{aligned}
T \cdot T = N \cdot N = B \cdot B &= 1, \\
T \cdot N = T \cdot B = N \cdot B &= 0.
\end{aligned} \tag{1.36} \]


If we differentiate the relation B · B = 1, we find that B · B' = 0, hence B' is
orthogonal to B. Furthermore, differentiating the equation T · B = 0, we get

\[ B' \cdot T + B \cdot T' = 0. \]

Rewriting the last equation,

\[ B' \cdot T = -T' \cdot B = -\kappa N \cdot B = 0, \]

we also conclude that B' must also be orthogonal to T. This can only happen
if B' is orthogonal to the TB-plane, so B' must be proportional to N. In other
words, we must have

\[ B'(s) = -\tau N(s), \tag{1.37} \]

for some quantity τ, which we will call the torsion. The torsion is similar to
the curvature in the sense that it measures the rate of change of the binormal.
Since the binormal also has unit length, the only way one can have a non-zero
derivative is if B is changing directions. This means that if in addition B did
not change directions, the vector would truly be a constant vector, so the curve
would be a flat curve embedded into the TN-plane.

The quantity B' then measures the rate of change in the up and down direction
of an observer moving with the curve always facing forward in the direction of
the tangent vector. The binormal B is something like the flag on the back of a
sand dune buggy.

The set of basis vectors {T, N, B} is called the Frenet frame or the repère
mobile (moving frame). The advantage of this basis over the fixed {i, j, k} basis
is that the Frenet frame is naturally adapted to the curve. It propagates along the
curve with the tangent vector always pointing in the direction of motion, and
the normal and binormal vectors pointing in the directions in which the curve
is tending to curve. In particular, a complete description of how the curve is
curving can be obtained by calculating the rate of change of the frame in terms
of the frame itself.

Fig. 1.7: Frenet Frame.


1.3.8 Theorem Let β(s) be a unit speed curve with curvature κ and torsion
τ. Then

\[ \begin{aligned}
T' &= \kappa N, \\
N' &= -\kappa T + \tau B, \\
B' &= -\tau N.
\end{aligned} \tag{1.38} \]

Proof We need only establish the equation for N'. Differentiating the equation
N · N = 1, we get 2N · N' = 0, so N' is orthogonal to N. Hence, N' must be a
linear combination of T and B.

\[ N' = aT + bB. \]

Taking the dot product of the last equation with T and B respectively, we see that

\[ a = N' \cdot T, \quad \text{and} \quad b = N' \cdot B. \]

On the other hand, differentiating the equations N · T = 0, and N · B = 0, we
find that

\[ \begin{aligned}
N' \cdot T &= -N \cdot T' = -N \cdot (\kappa N) = -\kappa, \\
N' \cdot B &= -N \cdot B' = -N \cdot (-\tau N) = \tau.
\end{aligned} \]

We conclude that a = −κ, b = τ, and thus

\[ N' = -\kappa T + \tau B. \]


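The Frenet frame equations (1.38) can also be written in matrix form as shown
below.

\[ \begin{bmatrix} T \\ N \\ B \end{bmatrix}' =
\begin{bmatrix} 0 & \kappa & 0 \\ -\kappa & 0 & \tau \\ 0 & -\tau & 0 \end{bmatrix}
\begin{bmatrix} T \\ N \\ B \end{bmatrix}. \tag{1.39} \]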


The group-theoretic significance of this matrix formulation is quite important 
and we will come back to this later when we talk about general orthonormal 
frames. Presently, perhaps it suffices to point out that the appearance of an 
antisymmetric matrix in the Frenet equations is not at all coincidental. 

The following theorem provides a computational method to calculate the 
curvature and torsion directly from the equation of a given unit speed curve. 


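1.3.9 Proposition Let β(s) be a unit speed curve with curvature κ > 0 and
torsion τ. Then

\[ \begin{aligned}
\kappa &= \|\beta''(s)\|, \\
\tau &= \frac{\beta' \cdot [\beta'' \times \beta''']}{\beta'' \cdot \beta''}.
\end{aligned} \tag{1.40} \]

Proof If β(s) is a unit speed curve, we have β'(s) = T. Then

\[ \begin{aligned}
T' = \beta''(s) &= \kappa N, \\
\beta'' \cdot \beta'' &= (\kappa N) \cdot (\kappa N), \\
\beta'' \cdot \beta'' &= \kappa^2, \\
\kappa^2 &= \|\beta''\|^2.
\end{aligned} \]

\[ \begin{aligned}
\beta'''(s) &= \kappa' N + \kappa N' \\
&= \kappa' N + \kappa(-\kappa T + \tau B) \\
&= \kappa' N - \kappa^2 T + \kappa\tau B.
\end{aligned} \]

\[ \begin{aligned}
\beta' \cdot [\beta'' \times \beta'''] &= T \cdot [\kappa N \times (\kappa' N - \kappa^2 T + \kappa\tau B)] \\
&= T \cdot [\kappa^3 B + \kappa^2 \tau T] \\
&= \kappa^2 \tau, \\
\tau &= \frac{\beta' \cdot [\beta'' \times \beta''']}{\kappa^2} \\
&= \frac{\beta' \cdot [\beta'' \times \beta''']}{\beta'' \cdot \beta''}.
\end{aligned} \]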


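1.3.10 Example Consider a circle of radius r whose equation is given by

\[ \alpha(t) = (r \cos t,\; r \sin t,\; 0). \]

Then,

\[ \begin{aligned}
\alpha'(t) &= (-r \sin t,\; r \cos t,\; 0), \\
\|\alpha'(t)\| &= \sqrt{(-r \sin t)^2 + (r \cos t)^2 + 0^2} \\
&= \sqrt{r^2(\sin^2 t + \cos^2 t)} \\
&= r.
\end{aligned} \]

Therefore, ds/dt = r and s = rt, which we recognize as the formula for the
length of an arc of circle of radius r, subtended by a central angle whose measure
is t radians. We conclude that

\[ \beta(s) = \left( r \cos\frac{s}{r},\; r \sin\frac{s}{r},\; 0 \right) \]

is a unit speed reparametrization. The curvature of the circle can now be easily
computed

\[ \begin{aligned}
T = \beta'(s) &= \left( -\sin\frac{s}{r},\; \cos\frac{s}{r},\; 0 \right), \\
T' &= \left( -\frac{1}{r}\cos\frac{s}{r},\; -\frac{1}{r}\sin\frac{s}{r},\; 0 \right), \\
\kappa = \|\beta''\| &= \|T'\| = \frac{1}{r}.
\end{aligned} \]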


This is a very simple but important example. The fact that for a circle of radius
r the curvature is κ = 1/r could not be more intuitive. A small circle has large
curvature and a large circle has small curvature. As the radius of the circle
approaches infinity, the circle locally looks more and more like a straight line,
and the curvature approaches 0. If one were walking along a great circle on a
very large sphere (like the earth) one would perceive the space to be locally
flat.


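1.3.11 Proposition Let α(t) be a curve of velocity v, acceleration a, speed v
and curvature κ. Then

\[ \begin{aligned}
\mathbf{v} &= vT, \\
\mathbf{a} &= \frac{dv}{dt}T + v^2 \kappa N.
\end{aligned} \tag{1.41} \]

Fig. 1.8: Osculating Circle

Proof Let s(t) be the arc length and let β(s) be a unit speed reparametrization.
Then α(t) = β(s(t)) and by the chain rule

\[ \begin{aligned}
\mathbf{v} &= \alpha'(t), \\
&= \beta'(s(t))\,s'(t), \\
&= vT.
\end{aligned} \]

\[ \begin{aligned}
\mathbf{a} &= \alpha''(t), \\
&= \frac{dv}{dt}T + v\,T'(s(t))\,s'(t), \\
&= \frac{dv}{dt}T + v(\kappa N)v, \\
&= \frac{dv}{dt}T + v^2 \kappa N.
\end{aligned} \]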


Equation 1.41 is important in physics. The equation states that a particle
moving along a curve in space feels a component of acceleration along the
direction of motion whenever there is a change of speed, and a centripetal
acceleration in the direction of the normal whenever it changes direction. The
centripetal acceleration at any point is

\[ a = v^2 \kappa = \frac{v^2}{r}, \]

where r is the radius of a circle called the osculating circle.

The osculating circle has maximal tangential contact with the curve at the 
point in question. This is called contact of order 2, in the sense that the circle 
passes through two nearby points in the curve. The osculating circle can be envisioned
by a limiting process similar to that of the tangent to a curve in differential
calculus. Let p be a point on the curve, and let q₁ and q₂ be two nearby points. If
the three points are not collinear, they uniquely determine a circle. The center 
of this circle is located at the intersection of the perpendicular bisectors of the 
segments joining two consecutive points. This circle is a “secant” approximation 
to the tangent circle. As the points q₁ and q₂ approach the point p, the “secant”
circle approaches the osculating circle. The osculating circle, as shown in figure 
1.8, always lies in the TN-plane, which by analogy is called the osculating 
plane. If T' = 0, then κ = 0 and the osculating circle degenerates into a circle
of infinite radius, that is, a straight line. The physics interpretation of equation
1.41 is that as a particle moves along a curve, in some sense at an infinitesimal
level, it is moving tangential to a circle, and hence, the centripetal acceleration
at each point coincides with the centripetal acceleration along the osculating 
circle. As the points move along, the osculating circles move along with them, 
changing their radii appropriately. 
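The limiting "secant circle" construction can be tried numerically. The sketch below is only an illustration under stated assumptions: the sample curve is the parabola y = x² (whose curvature at the origin is 2, so the osculating radius there is 1/2), and the function names are hypothetical. It computes the radius of the circle through three nearby points using the classical formula R = abc / (4 · area).

```python
import math

def circumradius(p, q, r):
    # Radius of the circle through three non-collinear plane points,
    # R = (|pq| |qr| |rp|) / (4 * area of the triangle pqr).
    a = math.dist(p, q)
    b = math.dist(q, r)
    c = math.dist(r, p)
    area = abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])) / 2.0
    return a * b * c / (4.0 * area)

def parabola(t):
    # Sample curve (assumption): y = x^2, with curvature 2 at the origin.
    return (t, t * t)

# Secant circles through q1 = parabola(-h), p = parabola(0), q2 = parabola(h).
for h in (0.5, 0.1, 0.01):
    print(h, circumradius(parabola(-h), parabola(0.0), parabola(h)))
```

As h shrinks, the printed radii approach the radius of the osculating circle at the origin. For this particular curve the secant radius works out to (1 + h²)/2, which makes the convergence easy to follow by hand.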


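1.3.12 Example (Helix)

\[ \begin{aligned}
\beta(s) &= \left( a \cos\frac{\omega s}{c},\; a \sin\frac{\omega s}{c},\; \frac{bs}{c} \right), \quad \text{where } c = \sqrt{a^2\omega^2 + b^2}, \\
\beta'(s) &= \left( -\frac{a\omega}{c} \sin\frac{\omega s}{c},\; \frac{a\omega}{c} \cos\frac{\omega s}{c},\; \frac{b}{c} \right), \\
\beta''(s) &= \left( -\frac{a\omega^2}{c^2} \cos\frac{\omega s}{c},\; -\frac{a\omega^2}{c^2} \sin\frac{\omega s}{c},\; 0 \right), \\
\beta'''(s) &= \left( \frac{a\omega^3}{c^3} \sin\frac{\omega s}{c},\; -\frac{a\omega^3}{c^3} \cos\frac{\omega s}{c},\; 0 \right), \\
\kappa^2 &= \beta'' \cdot \beta'' = \frac{a^2\omega^4}{c^4}, \\
\kappa &= \frac{a\omega^2}{c^2}, \\
\tau &= \frac{(\beta'\beta''\beta''')}{\beta'' \cdot \beta''} \\
&= \frac{b}{c}
\begin{vmatrix}
-\frac{a\omega^2}{c^2} \cos\frac{\omega s}{c} & -\frac{a\omega^2}{c^2} \sin\frac{\omega s}{c} \\
\frac{a\omega^3}{c^3} \sin\frac{\omega s}{c} & -\frac{a\omega^3}{c^3} \cos\frac{\omega s}{c}
\end{vmatrix}
\frac{c^4}{a^2\omega^4} \\
&= \frac{b}{c}\,\frac{a^2\omega^5}{c^5}\,\frac{c^4}{a^2\omega^4}.
\end{aligned} \]

Simplifying the last expression and substituting the value of c, we get

\[ \begin{aligned}
\tau &= \frac{b\omega}{a^2\omega^2 + b^2}, \\
\kappa &= \frac{a\omega^2}{a^2\omega^2 + b^2}.
\end{aligned} \]

Notice that if b = 0, the helix collapses to a circle in the xy-plane. In this case,
the formulas above reduce to κ = 1/a and τ = 0. The ratio κ/τ = aω/b is
particularly simple. Any curve for which κ/τ = constant, is called a helix; the
circular helix is a special case.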
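The closed forms for the helix can be cross-checked against formula (1.40). The following sketch is an illustration only; the function names and the sample values a = 2, ω = 3, b = 1 are assumptions. It evaluates the three derivatives of the unit speed helix by hand and feeds them to the formulas κ = ||β''|| and τ = β' · [β'' × β'''] / (β'' · β'').

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def helix_derivatives(s, a, omega, b):
    # beta', beta'', beta''' of the unit speed helix, written out by hand.
    c = math.sqrt(a * a * omega * omega + b * b)
    k = omega / c
    cs, sn = math.cos(k * s), math.sin(k * s)
    d1 = (-a * k * sn, a * k * cs, b / c)
    d2 = (-a * k * k * cs, -a * k * k * sn, 0.0)
    d3 = (a * k ** 3 * sn, -a * k ** 3 * cs, 0.0)
    return d1, d2, d3

def curvature_torsion_unit_speed(d1, d2, d3):
    # Formula (1.40); valid only for unit speed curves with kappa > 0.
    kappa = math.sqrt(dot(d2, d2))
    tau = dot(d1, cross(d2, d3)) / dot(d2, d2)
    return kappa, tau

a, omega, b = 2.0, 3.0, 1.0
kappa, tau = curvature_torsion_unit_speed(*helix_derivatives(0.4, a, omega, b))
c2 = a * a * omega * omega + b * b
print(kappa, a * omega ** 2 / c2)   # numerical versus closed form curvature
print(tau, b * omega / c2)          # numerical versus closed form torsion
```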


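1.3.13 Example (Plane curves) Let α(t) = (x(t), y(t), 0). Then

\[ \begin{aligned}
\alpha' &= (x', y', 0), \\
\alpha'' &= (x'', y'', 0), \\
\alpha''' &= (x''', y''', 0), \\
\kappa &= \frac{\|\alpha' \times \alpha''\|}{\|\alpha'\|^3} \\
&= \frac{|x'y'' - y'x''|}{(x'^2 + y'^2)^{3/2}}, \\
\tau &= 0.
\end{aligned} \]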


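1.3.14 Example Let β(s) = (x(s), y(s), 0), where

\[ x(s) = \int_0^s \cos\frac{t^2}{2c^2}\,dt, \qquad y(s) = \int_0^s \sin\frac{t^2}{2c^2}\,dt. \tag{1.42} \]

Then, using the fundamental theorem of calculus, we have

\[ \beta'(s) = \left( \cos\frac{s^2}{2c^2},\; \sin\frac{s^2}{2c^2},\; 0 \right). \]

Since ||β'|| = v = 1, the curve is of unit speed, and s is indeed the arc length.
The curvature is given by

\[ \begin{aligned}
\kappa = |x'y'' - y'x''| &= (\beta'' \cdot \beta'')^{1/2}, \\
&= \left\| \left( -\frac{s}{c^2} \sin\frac{s^2}{2c^2},\; \frac{s}{c^2} \cos\frac{s^2}{2c^2},\; 0 \right) \right\|, \\
&= \frac{s}{c^2}.
\end{aligned} \]

The functions (1.42) are the classical Fresnel integrals which we will discuss in
more detail in the next section.

In cases where the given curve α(t) is not of unit speed, the following proposition
provides formulas to compute the curvature and torsion in terms of α.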


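1.3.15 Proposition If α(t) is a regular curve in R³, then

\[ \kappa^2 = \frac{\|\alpha' \times \alpha''\|^2}{\|\alpha'\|^6}, \]

\[ \tau = \frac{(\alpha'\alpha''\alpha''')}{\|\alpha' \times \alpha''\|^2}, \tag{1.44} \]

where (α'α''α''') is the triple vector product [α' × α''] · α'''.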


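Proof

\[ \begin{aligned}
\alpha' &= vT, \\
\alpha'' &= v'T + v^2 \kappa N, \\
\alpha''' &= (v^2 \kappa)\,N'(s(t))\,s'(t) + \dots, \\
&= v^3 \kappa N' + \dots, \\
&= v^3 \kappa \tau B + \dots.
\end{aligned} \]

As the computation below shows, the other terms in α''' are unimportant here
because α' × α'' is proportional to B, so all we need is the B component to
solve for τ.

\[ \begin{aligned}
\alpha' \times \alpha'' &= v^3 \kappa (T \times N) = v^3 \kappa B, \\
\|\alpha' \times \alpha''\| &= v^3 \kappa, \\
\kappa &= \frac{\|\alpha' \times \alpha''\|}{v^3}, \\
(\alpha' \times \alpha'') \cdot \alpha''' &= v^6 \kappa^2 \tau, \\
\tau &= \frac{(\alpha'\alpha''\alpha''')}{v^6 \kappa^2} \\
&= \frac{(\alpha'\alpha''\alpha''')}{\|\alpha' \times \alpha''\|^2}.
\end{aligned} \]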
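The formulas of Proposition 1.3.15 are easy to turn into a small routine. The sketch below is illustrative only: the twisted cubic α(t) = (t, t², t³) is an assumed sample curve that does not appear in the text, and the function names are hypothetical. The derivatives are supplied by hand, so the routine only evaluates cross and dot products.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def curvature_torsion(d1, d2, d3):
    # kappa = ||a' x a''|| / ||a'||^3 and tau = (a' a'' a''') / ||a' x a''||^2
    # for a regular curve with a' x a'' != 0 (arbitrary parametrization).
    w = cross(d1, d2)
    kappa = norm(w) / norm(d1) ** 3
    tau = dot(w, d3) / dot(w, w)
    return kappa, tau

def twisted_cubic_derivatives(t):
    # Sample curve (assumption): alpha(t) = (t, t^2, t^3), not unit speed.
    return (1.0, 2.0 * t, 3.0 * t * t), (0.0, 2.0, 6.0 * t), (0.0, 0.0, 6.0)

for t in (0.0, 1.0):
    print(t, curvature_torsion(*twisted_cubic_derivatives(t)))
```

No arc length reparametrization is needed here, which is the practical advantage of the proposition over formula (1.40).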


1.4 Fundamental Theorem of Curves 


The fundamental theorem of curves basically states that prescribing a curvature
and torsion as functions of some parameter s, completely determines up
to position and orientation, a curve β(s) with that given curvature and torsion.
Some geometrical insight into the significance of the curvature and torsion can
be gained by considering the Taylor series expansion of an arbitrary unit speed
curve β(s) about s = 0.

\[ \beta(s) = \beta(0) + \beta'(0)s + \frac{\beta''(0)}{2!}s^2 + \frac{\beta'''(0)}{3!}s^3 + \dots \tag{1.45} \]

Since we are assuming that s is an arc length parameter,

\[ \begin{aligned}
\beta'(0) &= T(0) = T_0, \\
\beta''(0) &= (\kappa N)(0) = \kappa_0 N_0, \\
\beta'''(0) &= (-\kappa^2 T + \kappa' N + \kappa\tau B)(0) = -\kappa_0^2 T_0 + \kappa_0' N_0 + \kappa_0 \tau_0 B_0.
\end{aligned} \]

Keeping only the lowest terms in the components of T, N, and B, we get the
first order Frenet approximation to the curve

\[ \beta(s) \approx \beta(0) + T_0 s + \frac{1}{2}\kappa_0 N_0 s^2 + \frac{1}{6}\kappa_0 \tau_0 B_0 s^3. \tag{1.46} \]


The first two terms represent the linear approximation to the curve. The first
three terms approximate the curve by a parabola which lies in the osculating
plane (TN-plane). If κ₀ = 0, then locally the curve looks like a straight line.
If τ₀ = 0, then locally the curve is a plane curve contained in the osculating
plane. In this sense, the curvature measures the deviation of the curve from
a straight line and the torsion (also called the second curvature) measures the
deviation of the curve from a plane curve. As shown in figure 1.9 a non-planar
space curve locally looks like a wire that has first been bent into a parabolic
shape in the TN-plane and twisted into a cubic along the B axis.

Fig. 1.9: Cubic Approximation to a Curve

So suppose that p is an arbitrary point on a curve β(s) parametrized by arc
length. We position the curve so that p is at the origin so that β(0) = 0 coincides
with the point p. We choose the orthonormal basis vectors {e₁, e₂, e₃} in R³ to
coincide with the Frenet frame T₀, N₀, B₀ at that point. Then, equation (1.46)
provides a canonical representation of the curve near that point. This then
constitutes a proof of the fundamental theorem of curves under the assumption
that the curve, curvature and torsion are analytic. (One could also treat the Frenet
formulas as a system of differential equations and apply the conditions of existence
and uniqueness of solutions for such systems.)
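To get a feeling for the quality of the Frenet approximation (1.46), one can compare it with an explicit curve. The sketch below is an illustration under assumptions: it uses the unit speed helix with a = ω = b = 1 (so c = √2 and κ = τ = 1/2 by the helix formulas of section 1.3), and the Frenet data at s = 0 are entered by hand. Since (1.46) drops some cubic terms, the error is expected to shrink roughly like s³.

```python
import math

C = math.sqrt(2.0)   # helix with a = omega = b = 1, so c = sqrt(2)

def beta(s):
    # Unit speed helix (sample curve chosen for illustration).
    return (math.cos(s / C), math.sin(s / C), s / C)

# Frenet data at s = 0, worked out by hand for this helix.
KAPPA0, TAU0 = 0.5, 0.5
BETA0 = (1.0, 0.0, 0.0)
T0 = (0.0, 1.0 / C, 1.0 / C)
N0 = (-1.0, 0.0, 0.0)
B0 = (0.0, -1.0 / C, 1.0 / C)   # B0 = T0 x N0

def frenet_approx(s):
    # beta(0) + T0 s + (1/2) kappa0 N0 s^2 + (1/6) kappa0 tau0 B0 s^3
    return tuple(p + t * s + 0.5 * KAPPA0 * n * s ** 2 + KAPPA0 * TAU0 * b * s ** 3 / 6.0
                 for p, t, n, b in zip(BETA0, T0, N0, B0))

def error(s):
    # Distance between the true point and its Frenet approximation.
    return math.dist(beta(s), frenet_approx(s))

for s in (0.2, 0.1, 0.05):
    print(s, error(s))
```

Halving s should reduce the printed error by a factor of about eight, consistent with a cubic error term.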


1.4.1 Proposition A curve with κ = 0 is part of a straight line.
If κ = 0 then β(s) = β(0) + sT₀.


1.4.2 Proposition A curve α(t) with τ = 0 is a plane curve.


Proof If τ = 0, then (α'α''α''') = 0. This means that the three vectors α', α'',
and α''' are linearly dependent and hence there exist functions a₁(s), a₂(s) and
a₃(s) such that

\[ a_3 \alpha''' + a_2 \alpha'' + a_1 \alpha' = 0. \]

This linear homogeneous equation will have a solution of the form

\[ \alpha = \mathbf{c}_1 \alpha_1 + \mathbf{c}_2 \alpha_2 + \mathbf{c}_3, \qquad \mathbf{c}_i = \text{constant vectors}. \]

This curve lies in the plane

\[ (\mathbf{x} - \mathbf{c}_3) \cdot \mathbf{n} = 0, \quad \text{where } \mathbf{n} = \mathbf{c}_1 \times \mathbf{c}_2. \]


A consequence of the Frenet Equations is that given two curves in space C
and C* such that κ(s) = κ*(s) and τ(s) = τ*(s), the two curves are the same
up to their position in space. To clarify what we mean by their “position” we
need to review some basic concepts of linear algebra leading to the notion of
isometries.


1.4.1 Isometries 


1.4.3 Definition Let x and y be two column vectors in Rⁿ and let x^T
represent the transposed row vector. To keep track of whether a vector is
a row vector or a column vector, hereafter we write the components {x^i} of a
column vector with the indices up and the components {x_i} of a row vector with
the indices down. Similarly, if A is an n × n matrix, we write its components
as A = (a^i_j). The standard inner product is given by matrix multiplication of
the row and column vectors

\begin{align}
\langle \mathbf{x}, \mathbf{y} \rangle &= \mathbf{x}^T \mathbf{y}, \tag{1.47} \\
&= \langle \mathbf{y}, \mathbf{x} \rangle. \tag{1.48}
\end{align}

The inner product gives Rⁿ the structure of a normed space by defining
||x|| = ⟨x, x⟩^{1/2} and the structure of a metric space in which d(x, y) = ||x − y||. The
real inner product is bilinear (linear in each slot), from which it follows that

\[ \|\mathbf{x} \pm \mathbf{y}\|^2 = \|\mathbf{x}\|^2 \pm 2\langle \mathbf{x}, \mathbf{y} \rangle + \|\mathbf{y}\|^2. \tag{1.49} \]

Thus, we have the polarization identity

\[ \langle \mathbf{x}, \mathbf{y} \rangle = \tfrac{1}{4}\|\mathbf{x} + \mathbf{y}\|^2 - \tfrac{1}{4}\|\mathbf{x} - \mathbf{y}\|^2. \tag{1.50} \]

The Euclidean inner product satisfies the relation

\[ \langle \mathbf{x}, \mathbf{y} \rangle = \|\mathbf{x}\| \cdot \|\mathbf{y}\| \cos\theta, \tag{1.51} \]

where θ is the angle subtended by the two vectors.

Two vectors x and y are called orthogonal if ⟨x, y⟩ = 0, and a set of
basis vectors ℬ = {e₁, ..., e_n} is called an orthonormal basis if ⟨e_i, e_j⟩ = δ_ij.
Given an orthonormal basis, the dual basis is the set of linear functionals {α^i}
such that α^i(e_j) = δ^i_j. In terms of basis components, column vectors are given
by x = x^i e_i, row vectors by x^T = x_j α^j, and the inner product

\[ \begin{aligned}
\langle \mathbf{x}, \mathbf{y} \rangle = \mathbf{x}^T \mathbf{y} &= (x_i \alpha^i)(y^j e_j), \\
&= (x_i y^j)\,\alpha^i(e_j) = (x_i y^j)\,\delta^i_j, \\
&= x_i y^i, \\
&= \begin{bmatrix} x_1 & x_2 & \dots & x_n \end{bmatrix}
\begin{bmatrix} y^1 \\ y^2 \\ \vdots \\ y^n \end{bmatrix}.
\end{aligned} \]

Since |cos θ| ≤ 1, equation 1.51 yields a special case of the Schwarz
inequality

\[ |\langle \mathbf{x}, \mathbf{y} \rangle| \le \|\mathbf{x}\| \cdot \|\mathbf{y}\|. \tag{1.52} \]


Let F be a linear transformation from Rⁿ to Rⁿ and ℬ = {e₁, ..., e_n} be an
orthonormal basis. Then, there exists a matrix A = [F]_ℬ given by

\[ A = (a^i_j) = \alpha^i(F(e_j)), \tag{1.53} \]

or in terms of the inner product,

\[ A = (a_{ij}) = \langle e_i, F(e_j) \rangle. \tag{1.54} \]

On the other hand, if A is a fixed n × n matrix, the map F defined by F(x) =
Ax is a linear transformation from Rⁿ to Rⁿ whose matrix representation
in the standard basis is the matrix A itself. It follows that given a linear
transformation represented by a matrix A, we have

\begin{align}
\langle \mathbf{x}, A\mathbf{y} \rangle &= \mathbf{x}^T A \mathbf{y}, \tag{1.55} \\
&= (A^T \mathbf{x})^T \mathbf{y}, \notag \\
&= \langle A^T \mathbf{x}, \mathbf{y} \rangle. \tag{1.56}
\end{align}


1.4.4 Definition A real n × n matrix A is called orthogonal if A^T A = A A^T =
I. The linear transformation represented by A is called an orthogonal transformation.
Equivalently, the transformation represented by A is orthogonal if

\[ \langle \mathbf{x}, A\mathbf{y} \rangle = \langle A^{-1}\mathbf{x}, \mathbf{y} \rangle. \tag{1.57} \]

Thus, real orthogonal transformations are represented by matrices whose inverse
is the transpose (unitary matrices in the complex case) and the condition A^T A = I
implies that det(A) = ±1.


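1.4.5 Theorem If A is an orthogonal matrix, then the transformation determined
by A preserves the inner product and the norm.

Proof

\[ \begin{aligned}
\langle A\mathbf{x}, A\mathbf{y} \rangle &= \langle A^T A \mathbf{x}, \mathbf{y} \rangle, \\
&= \langle \mathbf{x}, \mathbf{y} \rangle.
\end{aligned} \]

Furthermore, setting y = x:

\[ \begin{aligned}
\langle A\mathbf{x}, A\mathbf{x} \rangle &= \langle \mathbf{x}, \mathbf{x} \rangle, \\
\|A\mathbf{x}\|^2 &= \|\mathbf{x}\|^2, \\
\|A\mathbf{x}\| &= \|\mathbf{x}\|.
\end{aligned} \]

As a corollary, if {e_i} is an orthonormal basis, then so is {f_i = Ae_i}. That is,
an orthogonal transformation represents a rotation if det A = 1 and a rotation
with a reflection if det A = −1.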


1.4.6 Definition A mapping F : Rⁿ → Rⁿ is called an isometry if it preserves
distances. That is, if for all x, y

\[ d(F(\mathbf{x}), F(\mathbf{y})) = d(\mathbf{x}, \mathbf{y}). \tag{1.58} \]


1.4.7 Example (Translations) Let q be a fixed vector. The map F(x) =
x + q is called a translation. It is clearly an isometry since ||F(x) − F(y)|| =
||x + q − (y + q)|| = ||x − y||.


1.4.8 Theorem An orthogonal transformation is an isometry.
Proof Let F be an orthogonal transformation represented by an orthogonal
matrix A. Then, since the transformation is linear and preserves norms, we have:

\[ \begin{aligned}
d(F(\mathbf{x}), F(\mathbf{y})) &= \|A\mathbf{x} - A\mathbf{y}\|, \\
&= \|A(\mathbf{x} - \mathbf{y})\|, \\
&= \|\mathbf{x} - \mathbf{y}\|.
\end{aligned} \]

The composition of two isometries is also an isometry. The inverse of a
translation by q is a translation by −q. The inverse of an orthogonal transformation
represented by A is an orthogonal transformation represented by A⁻¹.
Consequently, the set of isometries consisting of translations and orthogonal
transformations constitutes a group. Given a general isometry, we can use a
translation to ensure that F(0) = 0. We now prove the following theorem.


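1.4.9 Theorem If F is an isometry such that F(0) = 0, then F is an
orthogonal transformation.
Proof We need to prove that F preserves the inner product and that it is
linear. We first show that F preserves norms. In fact

\[ \begin{aligned}
\|F(\mathbf{x})\| &= d(F(\mathbf{x}), \mathbf{0}), \\
&= d(F(\mathbf{x}), F(\mathbf{0})), \\
&= d(\mathbf{x}, \mathbf{0}) = \|\mathbf{x}\|.
\end{aligned} \]

Now, using 1.49 and the norm preserving property above, we have:

\[ \begin{aligned}
d(F(\mathbf{x}), F(\mathbf{y})) &= d(\mathbf{x}, \mathbf{y}), \\
\|F(\mathbf{x}) - F(\mathbf{y})\|^2 &= \|\mathbf{x} - \mathbf{y}\|^2, \\
\|F(\mathbf{x})\|^2 - 2\langle F(\mathbf{x}), F(\mathbf{y}) \rangle + \|F(\mathbf{y})\|^2 &= \|\mathbf{x}\|^2 - 2\langle \mathbf{x}, \mathbf{y} \rangle + \|\mathbf{y}\|^2, \\
\langle F(\mathbf{x}), F(\mathbf{y}) \rangle &= \langle \mathbf{x}, \mathbf{y} \rangle.
\end{aligned} \]

To show F is linear, let e_i be an orthonormal basis, which implies that f_i = F(e_i)
is also an orthonormal basis. Then

\[ \begin{aligned}
F(a\mathbf{x} + b\mathbf{y}) &= \sum_{i=1}^n \langle F(a\mathbf{x} + b\mathbf{y}), f_i \rangle f_i, \\
&= \sum_{i=1}^n \langle F(a\mathbf{x} + b\mathbf{y}), F(e_i) \rangle f_i, \\
&= \sum_{i=1}^n \langle a\mathbf{x} + b\mathbf{y}, e_i \rangle f_i, \\
&= a \sum_{i=1}^n \langle \mathbf{x}, e_i \rangle f_i + b \sum_{i=1}^n \langle \mathbf{y}, e_i \rangle f_i, \\
&= a \sum_{i=1}^n \langle F(\mathbf{x}), f_i \rangle f_i + b \sum_{i=1}^n \langle F(\mathbf{y}), f_i \rangle f_i, \\
&= aF(\mathbf{x}) + bF(\mathbf{y}).
\end{aligned} \]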


1.4.10 Theorem If F : Rⁿ → Rⁿ is an isometry then

\[ F(\mathbf{x}) = A\mathbf{x} + \mathbf{q}, \tag{1.59} \]

where A is orthogonal.

Proof If F(0) = q, then F̃ = F − q is an isometry with F̃(0) = 0 and hence
by the previous theorem F̃ is an orthogonal transformation represented by an
orthogonal matrix, F̃(x) = Ax. It follows that F(x) = Ax + q.


We have just shown that any isometry is the composition of a translation and
an orthogonal transformation. The latter is the linear part of the isometry. 
The orthogonal transformation preserves the inner product, lengths, and maps 


orthonormal bases to orthonormal bases. 
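The structure F(x) = Ax + q can be illustrated with a few lines of code. This is a minimal sketch, not a construction from the text: the rotation about the z-axis, the translation vector, and the helper names are assumptions made for the example. It checks that the linear part preserves the inner product and that the full map preserves distances.

```python
import math

def rotation_z(angle):
    # An orthogonal matrix with determinant 1: rotation about the z-axis.
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def mat_vec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def dist(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def isometry(A, q):
    # Returns the map F(x) = Ax + q.
    return lambda x: [u + v for u, v in zip(mat_vec(A, x), q)]

A = rotation_z(0.7)          # sample angle (assumption)
q = [3.0, -1.0, 2.0]         # sample translation (assumption)
F = isometry(A, q)

x = [1.0, 2.0, 3.0]
y = [-0.5, 4.0, 2.0]
print(dot(mat_vec(A, x), mat_vec(A, y)), dot(x, y))   # inner product preserved by A
print(dist(F(x), F(y)), dist(x, y))                   # distance preserved by F
```

Note that F itself does not preserve inner products unless q = 0; only its linear part does, which is why the theorem first normalizes to F(0) = 0.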


1.4.11 Theorem If α is a curve in Rⁿ and β is the image of α under a
mapping F, then vectors tangent to α get mapped to tangent vectors to β.
Proof Let β = F ∘ α. The proof follows trivially from the properties of the
Jacobian map β_* = (F ∘ α)_* = F_* α_* that takes tangent vectors to tangent
vectors. If in addition F is an isometry, then F_* maps the Frenet frame of α to
the Frenet frame of β.


We now have all the ingredients to prove the following: 


1.4.12 Theorem (Fundamental theorem of curves) If C and C̃ are space
curves such that κ(s) = κ̃(s), and τ(s) = τ̃(s) for all s, the curves are isometric.
Proof Given two such curves, we can perform a translation so that, for some
s = s₀, the corresponding points on C and C̃ are made to coincide. Without
loss of generality, we can take s₀ = 0 and make this point be the origin. Now we
perform an orthogonal transformation to make the Frenet frame {T₀, N₀, B₀} of C
coincide with the Frenet frame {T̃₀, Ñ₀, B̃₀} of C̃. By the Schwarz inequality, the
inner product of two unit vectors is at most 1, with equality if and only if the
vectors are equal. With this in mind, let

\[ L = T \cdot \tilde{T} + N \cdot \tilde{N} + B \cdot \tilde{B}. \]

A simple computation using the Frenet equations shows that L' = 0, so L =
constant. But at s = 0 the Frenet frames of the two curves coincide, so the
constant is 3 and this can only happen if for all s, T = T̃, N = Ñ, B = B̃.
Finally, since T = T̃, we have β'(s) = β̃'(s), so β(s) = β̃(s) + constant. But
since β(0) = β̃(0), the constant is 0 and β(s) = β̃(s) for all s.


1.4.2 Natural Equations 


The fundamental theorem of curves states that up to an isometry, that is
up to location and orientation, a curve is completely determined by the curvature
and torsion. However, the formulas for computing κ and τ are sufficiently
complicated that solving the Frenet system of differential equations could be a
daunting task indeed. With the invention of modern computers, obtaining and
plotting numerical solutions is a routine matter. There is a plethora of differential
equations solvers available nowadays, including the solvers built into
Maple, Mathematica, and Matlab. For plane curves, which are characterized
by τ = 0, it is possible to find an integral formula for the curve coordinates in
terms of the curvature. Given a curve parametrized by arc length, consider an
arbitrary point with position vector x = (x, y) on the curve, and let φ be the
angle that the tangent vector T makes with the horizontal, as shown in figure
1.10. Then, the Euclidean vector components of the unit tangent vector are
given by

\[ \mathbf{x}' = T = (\cos\varphi, \sin\varphi). \]

Fig. 1.10: Tangent

This means that

\[ \frac{dx}{ds} = \cos\varphi, \quad \text{and} \quad \frac{dy}{ds} = \sin\varphi. \]

From the first Frenet equation we also have

\[ \frac{dT}{ds} = \left( -\sin\varphi \frac{d\varphi}{ds},\; \cos\varphi \frac{d\varphi}{ds} \right) = \kappa N, \]

so that,

\[ \left\| \frac{dT}{ds} \right\| = \frac{d\varphi}{ds} = \kappa. \]

We conclude that

\[ x(s) = \int \cos\varphi \, ds, \qquad y(s) = \int \sin\varphi \, ds, \qquad \text{where } \varphi = \int \kappa \, ds. \tag{1.60} \]


Equations 1.60 are called the natural equations of a plane curve. Given the 
curvature κ, the equation of the curve can be obtained by “quadratures,” the
classical term for integrals. 
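When the quadratures cannot be done in closed form, the natural equations can be integrated numerically. The routine below is a minimal sketch under stated assumptions: it starts at the origin with φ(0) = 0 (any other initial data gives an isometric copy of the curve), uses a midpoint rule with a fixed step, and the function name is hypothetical. It is tested here on the constant curvature κ = 1/R, which should trace a circle of radius R.

```python
import math

def integrate_natural_equations(kappa, s_max, n=20000):
    # Integrates phi' = kappa(s), x' = cos(phi), y' = sin(phi) from s = 0
    # to s = s_max with n midpoint steps; returns the list of points (x, y).
    h = s_max / n
    x = y = phi = 0.0
    points = [(x, y)]
    for i in range(n):
        dphi = h * kappa((i + 0.5) * h)   # midpoint rule for the tangent angle
        mid_angle = phi + 0.5 * dphi      # tangent angle in the middle of the step
        x += h * math.cos(mid_angle)
        y += h * math.sin(mid_angle)
        phi += dphi
        points.append((x, y))
    return points

R = 2.0
circle = integrate_natural_equations(lambda s: 1.0 / R, 2.0 * math.pi * R)
print(circle[-1])   # after one full turn the curve should close up near (0, 0)
```

With these initial conditions the circle is centered at (0, R) rather than at the origin, so it differs from the parametrization in the circle example of this section only by a translation.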


1.4.13 Example Circle: κ = 1/R

The simplest natural equation is one where the curvature is constant. For
obvious geometrical reasons we choose this constant to be 1/R. Then, φ = s/R
and

\[ \mathbf{x} = \left( R \sin\frac{s}{R},\; -R \cos\frac{s}{R} \right), \]

which is the equation of a unit speed circle of radius R.


1.4.14 Example Cornu spiral: κ = πs
This is the most basic linear natural equation, except for the scaling factor of
π which is inserted for historical conventions. Then φ = ½πs², and

\[ x(s) = C(s) = \int \cos\left(\tfrac{1}{2}\pi s^2\right) ds; \qquad y(s) = S(s) = \int \sin\left(\tfrac{1}{2}\pi s^2\right) ds. \tag{1.61} \]


The functions C(s) and S(s) are called Fresnel Integrals. In the standard classical
function libraries of Maple and Mathematica, they are listed as FresnelC
and FresnelS respectively. The fast-increasing frequency of oscillations of the
integrands here make the computation prohibitive without the use of high-speed
computers. Graphing calculators are inadequate to render the rapid oscillations
for s ranging from 0 to 15, for example, and simple computer programs for the
trapezoidal rule as taught in typical calculus courses, completely fall apart in
this range. The Cornu spiral is the curve x(s) = (x(s), y(s)) parametrized by
Fresnel integrals (see figure 1.11a). It is a tribute to the mathematicians of
the 1800’s that not only were they able to compute the values of the Fresnel
integrals to 4 or 5 decimal places, but they did it for the range of s from 0 to
15 as mentioned above, producing remarkably accurate renditions of the spiral.
Fresnel integrals appear in the study of diffraction. If a coherent beam of light
such as a laser beam, hits a sharp straight edge and a screen is placed behind,
there will appear on the screen a pattern of diffraction fringes. The amplitude
and intensity of the diffraction pattern can be obtained by a geometrical construction
involving the Fresnel integrals. First consider the function Ψ(s) = ||x||
that measures the distance from the origin to the points in the Cornu spiral in
the first quadrant. The square of this function is then proportional to the intensity
of the diffraction pattern. The graph of |Ψ(s)|² is shown in figure 1.11b.
Translating this curve along an axis coinciding with that of the straight edge,
generates a three dimensional surface as shown from “above” in figure 1.11c. A
color scheme was used here to depict a model of the Fresnel diffraction by the
straight edge.

Fig. 1.11: Fresnel Diffraction
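For readers without access to a computer algebra system, the Fresnel integrals can be approximated with a composite Simpson rule. The sketch below is an illustration only; the function names are hypothetical and the default number of subintervals is an assumption that suits moderate values of s. For large s the integrands oscillate quickly, so the number of subintervals must be increased enough to resolve the oscillations.

```python
import math

def simpson(g, a, b, n):
    # Composite Simpson rule on [a, b] with n subintervals (n is made even).
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(a + i * h)
    return total * h / 3.0

def fresnel_c(s, n=2000):
    # C(s) = integral from 0 to s of cos(pi u^2 / 2) du
    return simpson(lambda u: math.cos(0.5 * math.pi * u * u), 0.0, s, n)

def fresnel_s(s, n=2000):
    # S(s) = integral from 0 to s of sin(pi u^2 / 2) du
    return simpson(lambda u: math.sin(0.5 * math.pi * u * u), 0.0, s, n)

def cornu_point(s, n=2000):
    # Point (x(s), y(s)) of the Cornu spiral.
    return (fresnel_c(s, n), fresnel_s(s, n))

for s in (0.5, 1.0, 2.0):
    print(s, cornu_point(s))
```

Sampling cornu_point over a range of s and plotting the pairs gives the spiral. Recomputing each integral from 0 is wasteful but keeps the sketch short; a cumulative sum would be the natural refinement.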


1.4.15 Example Logarithmic Spiral κ = 1/(as + b)

A logarithmic spiral is a curve in which the position vector x makes a constant
angle with the tangent vector, as shown in figure 1.12. A formula for the curve
can be found easily if one uses the calculus formula in polar coordinates

\[ \tan\psi = \frac{r}{dr/d\theta}. \tag{1.62} \]

Here, ψ is the angle between the polar direction and the tangent. If ψ is constant,
then one can immediately integrate the equation to get the exponential
function below, in which k is the constant of integration

\[ r(\theta) = k e^{(\cot\psi)\theta}. \tag{1.63} \]


Derivation of formula 1.62 has fallen through the cracks in standard fat calculus
textbooks, at best relegated to an advanced exercise which most students
will not do. Perhaps the reason is that the section on polar coordinates is typically
covered in Calculus II, so students have not yet been exposed to the tools
of vector calculus that facilitate the otherwise messy computation. To fill in
this gap, we present a short derivation of this neat formula. For a plane curve
in parametric polar coordinates, we have

\[ \begin{aligned}
\mathbf{x}(t) &= (r(t) \cos\theta(t),\; r(t) \sin\theta(t)), \\
\dot{\mathbf{x}} &= (\dot{r} \cos\theta - r \sin\theta\,\dot{\theta},\; \dot{r} \sin\theta + r \cos\theta\,\dot{\theta}).
\end{aligned} \]

A direct computation of the dot product gives,

\[ |\langle \mathbf{x}, \dot{\mathbf{x}} \rangle|^2 = (r\dot{r})^2. \]

On the other hand,

\[ \begin{aligned}
|\langle \mathbf{x}, \dot{\mathbf{x}} \rangle|^2 &= \|\mathbf{x}\|^2 \|\dot{\mathbf{x}}\|^2 \cos^2\psi, \\
&= r^2 (\dot{r}^2 + r^2\dot{\theta}^2) \cos^2\psi.
\end{aligned} \]

Equating the two, we find,

\[ \begin{aligned}
\dot{r}^2 &= (\dot{r}^2 + r^2\dot{\theta}^2) \cos^2\psi, \\
(\sin^2\psi)\,\dot{r}^2 &= r^2\dot{\theta}^2 \cos^2\psi, \\
(\sin\psi)\,dr &= r \cos\psi \, d\theta, \\
\tan\psi &= \frac{r}{dr/d\theta}.
\end{aligned} \]

Fig. 1.12: Logarithmic Spiral


We leave it to the reader to do a direct computation of the curvature. Instead,
we prove that if κ = 1/(as + b), where a and b are constant, then the curve is
a logarithmic spiral. From the natural equations, we have,

\[ \begin{aligned}
\frac{d\theta}{ds} &= \kappa = \frac{1}{as + b}, \\
\theta &= \frac{1}{a} \ln(as + b) + C, \qquad C = \text{const},
\end{aligned} \]

so that as + b = A e^{aθ} with A = e^{−aC}, and ds = A e^{aθ} dθ. Back to the
natural equations, the x and y coordinates are obtained by integrating,

\[ \begin{aligned}
x &= \int A e^{a\theta} \cos\theta \, d\theta, \\
y &= \int A e^{a\theta} \sin\theta \, d\theta.
\end{aligned} \]

We can avoid the integrations by parts by letting z = x + iy. We get

\[ \begin{aligned}
z &= A \int e^{a\theta} e^{i\theta} \, d\theta, \\
&= A \int e^{(a+i)\theta} \, d\theta, \\
&= \frac{A}{a+i}\, e^{(a+i)\theta}, \\
&= \frac{A}{a+i}\, e^{a\theta} e^{i\theta}.
\end{aligned} \]

Extracting the modulus |z| = r, we get

\[ r = \frac{A}{\sqrt{a^2 + 1}}\, e^{a\theta}, \tag{1.64} \]

which is the equation of a logarithmic spiral with a = cot ψ. As shown in figure
1.12, families of concentric logarithmic spirals are ubiquitous in nature, as in
flowers and pine cones, and in architectural designs. The projection of a conical
helix as in figure 4.8 onto the plane through the origin, is a logarithmic spiral.
The analog of a logarithmic spiral on a sphere is called a loxodrome as depicted
in figure 4.2.
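The defining property of the logarithmic spiral, a constant angle ψ between the position vector and the tangent, can be verified numerically from r(θ) = k e^{(cot ψ)θ}. The following sketch is an illustration only; the sample values k = 1 and ψ = π/3 and the function names are assumptions. The tangent direction is estimated by a central difference in θ.

```python
import math

def spiral_point(theta, k, psi):
    # Point of the spiral r = k exp(cot(psi) * theta) in Cartesian coordinates.
    r = k * math.exp(theta / math.tan(psi))
    return (r * math.cos(theta), r * math.sin(theta))

def angle_position_tangent(theta, k, psi, h=1e-6):
    # Angle between the position vector and a central-difference tangent.
    x = spiral_point(theta, k, psi)
    p = spiral_point(theta - h, k, psi)
    q = spiral_point(theta + h, k, psi)
    t = (q[0] - p[0], q[1] - p[1])
    cos_angle = (x[0] * t[0] + x[1] * t[1]) / (math.hypot(*x) * math.hypot(*t))
    return math.acos(cos_angle)

k, psi = 1.0, math.pi / 3.0      # sample constants (assumption)
for theta in (0.0, 1.0, 5.0):
    print(theta, angle_position_tangent(theta, k, psi), psi)
```

The middle column should stay equal to ψ, up to discretization error, no matter which θ is chosen.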


1.4.16 Example Meandering Curves: κ = sin s

A whole family of meandering curves are obtained by letting κ = A sin ks.
The meandering graph shown in figure 1.13 was obtained by numerical integration
for A = 2 and “wave number” k = 1. The larger the value of A the
larger the curvature of the “throats.” If A is large enough, the “throats” will
overlap.




Fig. 1.13: Meandering Curve 




Fig. 1.14: Bimodal Meander 


Using superpositions of sine functions gives rise to a beautiful family of
“multi-frequency” meanders with graphs that would challenge the most skillful
calligraphists of the 1800’s. Figure 1.14 shows a rendition with two sine functions
with equal amplitude $A = 1.8$, and with $k_1 = 1$, $k_2 = 1.2$.
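
The numerical integration mentioned in this example can be sketched as follows. This is only a minimal illustration, assuming the natural equations $d\theta/ds = \kappa(s)$, $dx/ds = \cos\theta$, $dy/ds = \sin\theta$ with the curve starting at the origin and $\theta(0) = 0$; the function names and the midpoint rule are our own choices, not necessarily the method used to produce the figures.

```python
import math

def integrate_natural_equation(kappa, length, steps=20000):
    """Integrate theta' = kappa(s), x' = cos(theta), y' = sin(theta).

    Assumes the curve starts at the origin with theta(0) = 0 and uses the
    midpoint rule. Returns the list of (x, y) points along the curve.
    """
    ds = length / steps
    s = theta = x = y = 0.0
    points = [(x, y)]
    for _ in range(steps):
        k_mid = kappa(s + 0.5 * ds)           # curvature at the midpoint of the step
        theta_mid = theta + 0.5 * ds * k_mid  # approximate tangent angle at the midpoint
        x += ds * math.cos(theta_mid)
        y += ds * math.sin(theta_mid)
        theta += ds * k_mid
        s += ds
        points.append((x, y))
    return points

def meander(A=2.0, k=1.0, periods=3, steps=20000):
    """Curve with curvature kappa(s) = A sin(k s) over a number of periods."""
    length = periods * 2.0 * math.pi / k
    return integrate_natural_equation(lambda s: A * math.sin(k * s), length, steps)

# Sanity check: constant curvature 1/R should close up into a circle of radius R.
R = 2.0
circle = integrate_natural_equation(lambda s: 1.0 / R, 2.0 * math.pi * R)
```

Calling `meander(A=2.0, k=1.0)` returns points that can be passed to any plotting tool; a two-frequency meander is obtained by passing a sum of two sine functions as `kappa`.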


Chapter 2 


Differential Forms 


2.1 One-Forms 


The concept of the differential of a function is one of the most puzzling ideas 
in elementary calculus. In the usual definition, the differential of a dependent 
variable y = f(x) is given in terms of the differential of the independent variable 
by dy = f'(x)dx. The problem is with the quantity dx. What does “dx” mean? 
What is the difference between $\Delta x$ and $dx$? How much “smaller” than $\Delta x$ does
dx have to be? There is no trivial resolution to this question. Most introductory 
calculus texts evade the issue by treating dx as an arbitrarily small quantity 
(lacking mathematical rigor) or by simply referring to dx as an infinitesimal 
(a term introduced by Newton for an idea that could not otherwise be clearly 
defined at the time.) 

In this section we introduce linear algebraic tools that will allow us to in- 
terpret the differential in terms of a linear operator. 


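2.1.1 Definition Let $p \in \mathbf{R}^n$, and let $T_p(\mathbf{R}^n)$ be the tangent space at $p$.
A 1-form at $p$ is a linear map $\phi$ from $T_p(\mathbf{R}^n)$ into $\mathbf{R}$, in other words, a linear
functional. We recall that such a map must satisfy the following properties:

$$\begin{aligned}
&\text{a)}\quad \phi(X_p) \in \mathbf{R}, \quad \forall X_p \in \mathbf{R}^n \\
&\text{b)}\quad \phi(aX_p + bY_p) = a\phi(X_p) + b\phi(Y_p), \quad \forall a, b \in \mathbf{R},\ X_p, Y_p \in T_p(\mathbf{R}^n)
\end{aligned} \tag{2.1}$$

A 1-form is a smooth assignment of a linear map $\phi$ as above for each point in
the space.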


2.1.2 Definition Let $f : \mathbf{R}^n \to \mathbf{R}$ be a real-valued $C^\infty$ function. We define
the differential $df$ of the function as the 1-form such that

$$df(X) = X(f), \tag{2.2}$$

for every vector field $X$ in $\mathbf{R}^n$. In other words, at any point $p$, the differential
$df$ of a function is an operator that assigns to a tangent vector $X_p$ the directional
derivative of the function in the direction of that vector.

$$df(X)(p) = X_p(f) = \nabla f(p)\cdot X(p). \tag{2.3}$$
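
As a quick numerical illustration of equation (2.3), the sketch below compares $\nabla f(p)\cdot X(p)$ with a finite-difference estimate of the directional derivative. The sample function $f(x,y) = x^2 y$, the point, the vector and the step size are arbitrary choices made here for illustration only.

```python
def f(x, y):
    """Sample scalar function f(x, y) = x^2 y (an arbitrary illustration)."""
    return x * x * y

def grad_f(x, y):
    """Gradient of f, computed by hand: (2xy, x^2)."""
    return (2.0 * x * y, x * x)

def df(p, v):
    """Value of the 1-form df at the point p on the tangent vector v."""
    gx, gy = grad_f(*p)
    return gx * v[0] + gy * v[1]

def directional_derivative(p, v, h=1e-5):
    """Central-difference approximation of X_p(f) for the tangent vector v at p."""
    forward = f(p[0] + h * v[0], p[1] + h * v[1])
    backward = f(p[0] - h * v[0], p[1] - h * v[1])
    return (forward - backward) / (2.0 * h)

p, v = (1.0, 2.0), (3.0, 4.0)
exact = df(p, v)                       # gradient dotted with the vector
approx = directional_derivative(p, v)  # derivative of f along the line p + t v
```

The two numbers agree up to discretization error, which is the content of the statement that $df$ assigns to a tangent vector the directional derivative of $f$.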


In particular, if we apply the differential of the coordinate functions $x^i$ to the
basis vector fields, we get

$$dx^i\left(\frac{\partial}{\partial x^j}\right) = \frac{\partial x^i}{\partial x^j} = \delta^i_j. \tag{2.4}$$

The set of all linear functionals on a vector space is called the dual of the
vector space. It is a standard theorem in linear algebra that the dual of a finite
dimensional vector space is also a vector space of the same dimension. Thus,
the space $T_p^*(\mathbf{R}^n)$ of all 1-forms at $p$ is a vector space which is the dual of
the tangent space $T_p(\mathbf{R}^n)$. The space $T_p^*(\mathbf{R}^n)$ is called the cotangent space of
$\mathbf{R}^n$ at the point $p$. Equation (2.4) indicates that the set of differential forms
$\{(dx^1)_p,\dots,(dx^n)_p\}$ constitutes the basis of the cotangent space which is dual
to the standard basis $\{(\partial/\partial x^1)_p,\dots,(\partial/\partial x^n)_p\}$ of the tangent space. The union of all
the cotangent spaces as $p$ ranges over all points in $\mathbf{R}^n$ is called the cotangent
bundle $T^*(\mathbf{R}^n)$.


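2.1.3 Proposition Let $f$ be a smooth function in $\mathbf{R}^n$ and let $\{x^1,\dots,x^n\}$ be
coordinate functions in a neighborhood $U$ of a point $p$. Then, the differential
$df$ is given locally by the expression

$$\begin{aligned}
df &= \sum_{i=1}^{n}\frac{\partial f}{\partial x^i}\,dx^i\\
&\equiv \frac{\partial f}{\partial x^i}\,dx^i.
\end{aligned} \tag{2.5}$$

Proof The differential $df$ is by definition a 1-form, so, at each point, it must be
expressible as a linear combination of the basis $\{(dx^1)_p,\dots,(dx^n)_p\}$. Therefore,
to prove the proposition, it suffices to show that the expression 2.5 applied to
an arbitrary tangent vector coincides with definition 2.2. To see this, consider
a tangent vector $X_p = v^j\left(\frac{\partial}{\partial x^j}\right)_p$ and apply the expression above as follows:

$$\begin{aligned}
\left(\frac{\partial f}{\partial x^i}\,dx^i\right)_p(X_p) &= \left(\frac{\partial f}{\partial x^i}\,dx^i\right)\left(v^j\frac{\partial}{\partial x^j}\right)(p)\\
&= v^j\left(\frac{\partial f}{\partial x^i}\,dx^i\right)\left(\frac{\partial}{\partial x^j}\right)(p)\\
&= v^j\left(\frac{\partial f}{\partial x^i}\,\frac{\partial x^i}{\partial x^j}\right)(p)\\
&= v^j\left(\frac{\partial f}{\partial x^i}\,\delta^i_j\right)(p)\\
&= \left(\frac{\partial f}{\partial x^i}\,v^i\right)(p)\\
&= \nabla f(p)\cdot X(p)\\
&= df(X)(p)
\end{aligned} \tag{2.6}$$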




The definition of differentials as linear functionals on the space of vector fields is
much more satisfactory than the notion of infinitesimals, since the new definition
is based on the rigorous machinery of linear algebra. If $\alpha$ is an arbitrary 1-form,
then locally

$$\alpha = a_1(x)\,dx^1 + \dots + a_n(x)\,dx^n, \tag{2.7}$$

where the coefficients $a_i$ are $C^\infty$ functions. Thus, a 1-form is a smooth section of
the cotangent bundle and we refer to it as a covariant tensor of rank 1, or simply
a covector. The collection of all 1-forms is denoted by $\Omega^1(\mathbf{R}^n) = \mathcal{T}^0_1(\mathbf{R}^n)$. The
coefficients $(a_1,\dots,a_n)$ are called the covariant components of the covector. We
will adopt the convention to always write the covariant components of a covector
with the indices down. Physicists often refer to the covariant components of a
1-form as a covariant vector and this causes some confusion about the position
of the indices. We emphasize that not all one forms are obtained by taking the
differential of a function. If there exists a function $f$, such that $\alpha = df$, then
the one form $\alpha$ is called exact. In vector calculus and elementary physics, exact
forms are important in understanding the path independence of line integrals
of conservative vector fields.

As we have already noted, the cotangent space $T_p^*(\mathbf{R}^n)$ of 1-forms at a point
$p$ has a natural vector space structure. We can easily extend the operations of
addition and scalar multiplication to the space of all 1-forms by defining

$$\begin{aligned}
(\alpha + \beta)(X) &= \alpha(X) + \beta(X)\\
(f\alpha)(X) &= f\alpha(X)
\end{aligned} \tag{2.8}$$

for all vector fields $X$ and all smooth functions $f$.


2.2 Tensors 


As we mentioned at the beginning of this chapter, the notion of the differential
$dx$ is not made precise in elementary treatments of calculus, so consequently,
the differential of area $dx\,dy$ in $\mathbf{R}^2$, as well as the differential of surface area in
$\mathbf{R}^3$ also need to be revisited in a more rigorous setting. For this purpose, we
introduce a new type of multiplication between forms that not only captures
the essence of differentials of area and volume, but also provides a rich algebraic
and geometric structure generalizing cross products (which make sense only in
$\mathbf{R}^3$) to Euclidean space of any dimension.


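2.2.1 Definition A map $\phi : \mathcal{X}(\mathbf{R}^n)\times\mathcal{X}(\mathbf{R}^n) \to \mathbf{R}$ is called a bilinear
map of vector fields, if it is linear on each slot. That is, $\forall X_i, Y_i \in \mathcal{X}(\mathbf{R}^n)$, $f^i \in \mathcal{F}(\mathbf{R}^n)$, we have

$$\begin{aligned}
\phi(f^1X_1 + f^2X_2, Y_1) &= f^1\phi(X_1,Y_1) + f^2\phi(X_2,Y_1)\\
\phi(X_1, f^1Y_1 + f^2Y_2) &= f^1\phi(X_1,Y_1) + f^2\phi(X_1,Y_2).
\end{aligned}$$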




2.2.1 Tensor Products 


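2.2.2 Definition Let $\alpha$ and $\beta$ be 1-forms. The tensor product of $\alpha$ and $\beta$ is
defined as the bilinear map $\alpha\otimes\beta$ such that

$$(\alpha\otimes\beta)(X,Y) = \alpha(X)\beta(Y) \tag{2.9}$$

for all vector fields $X$ and $Y$.
Thus, for example, if $\alpha = a_i\,dx^i$ and $\beta = b_j\,dx^j$, then,

$$\begin{aligned}
(\alpha\otimes\beta)\left(\frac{\partial}{\partial x^k},\frac{\partial}{\partial x^l}\right) &= \alpha\left(\frac{\partial}{\partial x^k}\right)\beta\left(\frac{\partial}{\partial x^l}\right)\\
&= (a_i\,dx^i)\left(\frac{\partial}{\partial x^k}\right)(b_j\,dx^j)\left(\frac{\partial}{\partial x^l}\right)\\
&= a_i\delta^i_k\,b_j\delta^j_l\\
&= a_kb_l.
\end{aligned}$$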
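
The computation above can be mirrored in a few lines of code. In this sketch, which is only an illustration, 1-forms with constant coefficients are modelled as Python functions acting on tuples of vector components; the helper names `one_form`, `tensor_product` and `basis_vector` are our own.

```python
def one_form(components):
    """Return the 1-form a_i dx^i as a function of a vector X = (X^1, ..., X^n)."""
    return lambda X: sum(a * x for a, x in zip(components, X))

def tensor_product(alpha, beta):
    """Bilinear map (alpha tensor beta)(X, Y) = alpha(X) * beta(Y)."""
    return lambda X, Y: alpha(X) * beta(Y)

def basis_vector(k, n):
    """Components of the k-th coordinate basis vector in R^n (k counted from 0)."""
    return tuple(1 if i == k else 0 for i in range(n))

a = (1, 2, 3)
b = (4, 5, 6)
T = tensor_product(one_form(a), one_form(b))

# Components T_kl obtained by feeding pairs of basis vectors into the bilinear map.
components = [[T(basis_vector(k, 3), basis_vector(l, 3)) for l in range(3)]
              for k in range(3)]
```

The entry `components[k][l]` equals `a[k] * b[l]`, and the array is not symmetric in general, which reflects the fact that $\alpha\otimes\beta \neq \beta\otimes\alpha$.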


A quantity of the form $T = T_{ij}\,dx^i\otimes dx^j$ is called a covariant tensor of rank
2, and we may think of the set $\{dx^i\otimes dx^j\}$ as a basis for all such tensors.
The space of covariant tensor fields of rank 2 is denoted $\mathcal{T}^0_2(\mathbf{R}^n)$. We must
caution the reader again that there is possible confusion about the location of
the indices, since physicists often refer to the components $T_{ij}$ as a covariant
tensor of rank two, as long as it satisfies some transformation laws.

In a similar fashion, one can define the tensor product of vectors $X$ and $Y$
as the bilinear map $X\otimes Y$ such that

$$(X\otimes Y)(f,g) = X(f)Y(g) \tag{2.10}$$

for any pair of arbitrary functions $f$ and $g$.

If $X = a^i\frac{\partial}{\partial x^i}$ and $Y = b^j\frac{\partial}{\partial x^j}$, then the components of $X\otimes Y$ in the basis
$\frac{\partial}{\partial x^i}\otimes\frac{\partial}{\partial x^j}$ are simply given by $a^ib^j$. Any bilinear map of the form

$$T = T^{ij}\,\frac{\partial}{\partial x^i}\otimes\frac{\partial}{\partial x^j} \tag{2.11}$$

is called a contravariant tensor of rank 2 in $\mathbf{R}^n$. The notion of tensor products
can easily be generalized to higher rank, and in fact one can have tensors of
mixed ranks. For example, a tensor of contravariant rank 2 and covariant rank
1 in $\mathbf{R}^n$ is represented in local coordinates by an expression of the form

$$T = T^{ij}{}_k\,\frac{\partial}{\partial x^i}\otimes\frac{\partial}{\partial x^j}\otimes dx^k.$$

This object is also called a tensor of type $\binom{2}{1}$. Thus, we may think of a tensor of
type $\binom{2}{1}$ as a map with three input slots. The map expects two functions in the
first two slots and a vector in the third one. The action of the map is bilinear
on the two functions and linear on the vector. The output is a real number.




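A tensor of type $\binom{r}{s}$ is written in local coordinates as

$$T = T^{i_1\dots i_r}{}_{j_1\dots j_s}\,\frac{\partial}{\partial x^{i_1}}\otimes\dots\otimes\frac{\partial}{\partial x^{i_r}}\otimes dx^{j_1}\otimes\dots\otimes dx^{j_s} \tag{2.12}$$

The tensor components are given by

$$T^{i_1\dots i_r}{}_{j_1\dots j_s} = T\left(dx^{i_1},\dots,dx^{i_r},\frac{\partial}{\partial x^{j_1}},\dots,\frac{\partial}{\partial x^{j_s}}\right). \tag{2.13}$$

The set $T^r_s|_p(\mathbf{R}^n)$ of all tensors of type $\binom{r}{s}$ at a point $p$ has a vector space
structure. The union of all such vector spaces is called the tensor bundle, and
smooth sections of the bundle are called tensor fields $\mathcal{T}^r_s(\mathbf{R}^n)$; that is, a tensor
field is a smooth assignment of a tensor to each point in $\mathbf{R}^n$.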


2.2.2 Inner Product 


Let $X = a^i\frac{\partial}{\partial x^i}$ and $Y = b^j\frac{\partial}{\partial x^j}$ be two vector fields and let

$$g(X,Y) = \delta_{ij}a^ib^j. \tag{2.14}$$

The quantity $g(X,Y)$ is an example of a bilinear map that the reader will
recognize as the usual dot product.


2.2.3 Definition A bilinear map $g(X,Y) = \langle X,Y\rangle$ on vectors is called a
real inner product if

1. $g(X,Y) = g(Y,X)$,
2. $g(X,X) \ge 0,\ \forall X$,
3. $g(X,X) = 0$ iff $X = 0$.


Since we assume $g(X,Y)$ to be bilinear, an inner product is completely specified
by its action on ordered pairs of basis vectors. The components $g_{ij}$ of the inner
product are thus given by

$$g\left(\frac{\partial}{\partial x^i},\frac{\partial}{\partial x^j}\right) = g_{ij}, \tag{2.15}$$

where $g_{ij}$ is a symmetric $n\times n$ matrix which we assume to be non-singular. By
linearity, it is easy to see that if $X = a^i\frac{\partial}{\partial x^i}$ and $Y = b^j\frac{\partial}{\partial x^j}$ are two arbitrary
vectors, then

$$\langle X,Y\rangle = g(X,Y) = g_{ij}a^ib^j.$$

In this sense, an inner product can be viewed as a generalization of the dot
product. The standard Euclidean inner product is obtained if we take $g_{ij} = \delta_{ij}$.
In this case, the quantity $g(X,X) = \|X\|^2$ gives the square of the length of the
vector. For this reason, $g_{ij}$ is called a metric and $g$ is called a metric tensor.




Another interpretation of the dot product can be seen if instead one considers
a vector $X = a^i\frac{\partial}{\partial x^i}$ and a 1-form $\alpha = b_j\,dx^j$. The action of the 1-form on
the vector gives

$$\begin{aligned}
\alpha(X) &= (b_j\,dx^j)\left(a^i\frac{\partial}{\partial x^i}\right)\\
&= b_ja^i\,(dx^j)\left(\frac{\partial}{\partial x^i}\right)\\
&= b_ja^i\delta^j_i\\
&= a^ib_i.
\end{aligned}$$

If we now define

$$b_i = g_{ij}b^j, \tag{2.16}$$

we see that the equation above can be rewritten as

$$a^ib_i = g_{ij}a^ib^j,$$

and we recover the expression for the inner product.

Equation (2.16) shows that the metric can be used as a mechanism to lower
indices, thus transforming the contravariant components of a vector to covariant
ones. If we let $g^{ij}$ be the inverse of the matrix $g_{ij}$, that is

$$g^{ij}g_{kj} = \delta^i_k, \tag{2.17}$$

we can also raise covariant indices by the equation

$$b^i = g^{ij}b_j. \tag{2.18}$$


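We have mentioned that the tangent and cotangent spaces of Euclidean space
at a particular point $p$ are isomorphic. In view of the above discussion, we see
that the metric $g$ can be interpreted on one hand as a bilinear pairing of two
vectors

$$g : T_p(\mathbf{R}^n)\times T_p(\mathbf{R}^n) \to \mathbf{R},$$

and on the other, as inducing a linear isomorphism

$$G_\flat : T_p(\mathbf{R}^n) \to T_p^*(\mathbf{R}^n)$$

defined by

$$G_\flat X(Y) = g(X,Y), \tag{2.19}$$

that maps vectors to covectors. To verify this definition is consistent with the
action of lowering indices, let $X = a^i\frac{\partial}{\partial x^i}$ and $Y = b^j\frac{\partial}{\partial x^j}$. We show that
$G_\flat X = a_i\,dx^i$. In fact,

$$\begin{aligned}
G_\flat X(Y) &= g(X,Y) = g_{ij}a^ib^j\\
&= a_jb^j = (a_j\,dx^j)(Y).
\end{aligned}$$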




The inverse map $G^\sharp : T_p^*(\mathbf{R}^n) \to T_p(\mathbf{R}^n)$ is defined by

$$\langle G^\sharp\alpha, X\rangle = \alpha(X), \tag{2.20}$$

for any 1-form $\alpha$ and tangent vector $X$. In quantum mechanics, it is common
to use Dirac’s notation, in which a linear functional $\alpha$ on a vector space $V$ is
called a bra-vector denoted by $\langle\alpha|$, and a vector $X \in V$ is called a ket-vector,
denoted by $|X\rangle$. The action of a bra-vector on a ket-vector is defined by the
bracket,

$$\langle\alpha|X\rangle = \alpha(X). \tag{2.21}$$

Thus, if the vector space has an inner product as above, we have

$$\langle\alpha|X\rangle = \langle G^\sharp\alpha, X\rangle = \alpha(X). \tag{2.22}$$

The mapping $C : T_p^*(\mathbf{R}^n)\times T_p(\mathbf{R}^n) \to \mathbf{R}$ given by $(\alpha,X) \mapsto \langle\alpha|X\rangle = \alpha(X)$ is called
a contraction. In passing, we introduce a related concept called the interior
product, or contraction of a vector and a form. If $\alpha$ is a $(k+1)$-form and $X$ a
vector, we define

$$i_X\alpha(X_1,\dots,X_k) = \alpha(X, X_1,\dots,X_k). \tag{2.23}$$

In particular, for a one form, we have

$$i_X\alpha = \langle\alpha|X\rangle = \alpha(X).$$


If $T$ is a type $\binom{1}{1}$ tensor, that is,

$$T = T^i{}_j\,dx^j\otimes\frac{\partial}{\partial x^i},$$

the contraction of the tensor is given by

$$\begin{aligned}
C(T) &= T^i{}_j\left\langle dx^j\,\Big|\,\frac{\partial}{\partial x^i}\right\rangle,\\
&= T^i{}_j\,\delta^j_i,\\
&= T^i{}_i.
\end{aligned}$$

In other words, the contraction of the tensor is the trace of the $n\times n$ array that
represents the tensor in the given basis. The notion of raising and lowering
indices as well as contractions can be extended to tensors of all types. Thus,
for example, we have

$$g^{ij}T_{jklm} = T^i{}_{klm}.$$

A contraction between the indices $i$ and $l$ in the tensor above could be denoted
by the notation

$$C^i_l(T^i{}_{klm}) = T^i{}_{kim} = T_{km}.$$


This is a very simple concept, but the notation for a general contraction is a
bit awkward because one needs to keep track of the positions of the indices
contracted. Let $T$ be a tensor of type $\binom{r}{s}$. A contraction $C^k_l$ yields a tensor of
type $\binom{r-1}{s-1}$. Let $T$ be given in the form 2.12. Then,

$$C^k_l(T) = T^{i_1\dots i_{k-1}\,m\,i_{k+1}\dots i_r}{}_{j_1\dots j_{l-1}\,m\,j_{l+1}\dots j_s}\ \frac{\partial}{\partial x^{i_1}}\otimes\dots\otimes\widehat{\frac{\partial}{\partial x^{i_k}}}\otimes\dots\otimes\frac{\partial}{\partial x^{i_r}}\otimes dx^{j_1}\otimes\dots\otimes\widehat{dx^{j_l}}\otimes\dots\otimes dx^{j_s},$$

where the “hat” means that these are excluded. Here is a very neat and most
useful result. If $S$ is a 2-tensor with symmetric components $S_{ij} = S_{ji}$ and $A$ is
a 2-tensor with antisymmetric components $A^{ij} = -A^{ji}$, then the contraction

$$S_{ij}A^{ij} = 0 \tag{2.24}$$

The short proof uses the fact that summation indices are dummy indices and
they can be relabeled at will by any other index that is not already used in an
expression. We have

$$S_{ij}A^{ij} = S_{kl}A^{kl} = -S_{kl}A^{lk} = -S_{lk}A^{lk} = -S_{ij}A^{ij} = 0,$$

since the quantity is the negative of itself.
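
A small numerical check of (2.24) may help fix the idea. The matrices below are arbitrary integer examples chosen for this sketch; the symmetric and antisymmetric arrays are built as $M + M^T$ and $N - N^T$ so that all arithmetic stays exact.

```python
def transpose(M):
    """Transpose of a square matrix stored as a list of rows."""
    return [list(row) for row in zip(*M)]

def add(M, N, sign=1):
    """Entrywise M + sign * N."""
    return [[m + sign * n for m, n in zip(rm, rn)] for rm, rn in zip(M, N)]

def contract(S, A):
    """Full contraction S_ij A^ij (sum over both indices)."""
    n = len(S)
    return sum(S[i][j] * A[i][j] for i in range(n) for j in range(n))

M = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
N = [[0, 1, 2], [3, 4, 5], [6, 7, 9]]

S = add(M, transpose(M))        # symmetric: S_ij = S_ji
A = add(N, transpose(N), -1)    # antisymmetric: A^ij = -A^ji

result = contract(S, A)         # vanishes by the relabeling argument above
```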


In terms of the vector space isomorphism between the tangent and cotangent
space induced by the metric, the gradient of a function $f$, viewed as a differential
geometry vector field, is given by

$$\operatorname{Grad} f = G^\sharp df, \tag{2.25}$$

or in components

$$(\nabla f)^i \equiv \nabla^i f = g^{ij}f_{,j}, \tag{2.26}$$

where $f_{,j}$ is the commonly used abbreviation for the partial derivative with
respect to $x^j$.
In elementary treatments of calculus, authors often ignore the subtleties of
differential 1-forms and tensor products and define the differential of arc length
as

$$ds^2 = g_{ij}\,dx^i\,dx^j,$$

although what is really meant by such an expression is

$$ds^2 = g_{ij}\,dx^i\otimes dx^j. \tag{2.27}$$


2.2.4 Example In cylindrical coordinates, the differential of arc length is

$$ds^2 = dr^2 + r^2d\theta^2 + dz^2. \tag{2.28}$$

In this case, the metric tensor has components

$$g_{ij} = \begin{pmatrix} 1 & 0 & 0\\ 0 & r^2 & 0\\ 0 & 0 & 1\end{pmatrix}. \tag{2.29}$$




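2.2.5 Example In spherical coordinates,

$$\begin{aligned}
x &= r\sin\theta\cos\phi,\\
y &= r\sin\theta\sin\phi,\\
z &= r\cos\theta,
\end{aligned} \tag{2.30}$$

and the differential of arc length is given by

$$ds^2 = dr^2 + r^2d\theta^2 + r^2\sin^2\theta\,d\phi^2 \tag{2.31}$$

In this case the metric tensor has components

$$g_{ij} = \begin{pmatrix} 1 & 0 & 0\\ 0 & r^2 & 0\\ 0 & 0 & r^2\sin^2\theta\end{pmatrix}. \tag{2.32}$$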
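
The components (2.32) can be checked numerically. The sketch below assumes the standard fact, not derived in this section, that the metric components in curvilinear coordinates are the Euclidean dot products of the coordinate tangent vectors $\partial\mathbf{x}/\partial r$, $\partial\mathbf{x}/\partial\theta$, $\partial\mathbf{x}/\partial\phi$ obtained from (2.30). The sample point is arbitrary.

```python
import math

def tangent_vectors(r, theta, phi):
    """Partial derivatives of (x, y, z) in (2.30) with respect to r, theta, phi."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    d_r = (st * cp, st * sp, ct)
    d_theta = (r * ct * cp, r * ct * sp, -r * st)
    d_phi = (-r * st * sp, r * st * cp, 0.0)
    return (d_r, d_theta, d_phi)

def metric(r, theta, phi):
    """Matrix of dot products g_ij of the coordinate tangent vectors."""
    basis = tangent_vectors(r, theta, phi)
    return [[sum(u * v for u, v in zip(ei, ej)) for ej in basis] for ei in basis]

r0, theta0, phi0 = 2.0, 0.7, 1.1
g = metric(r0, theta0, phi0)
expected_diagonal = (1.0, r0 ** 2, (r0 * math.sin(theta0)) ** 2)
```

Up to rounding, the off-diagonal entries of `g` vanish and the diagonal agrees with `expected_diagonal`, in agreement with (2.31).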


2.2.3 Minkowski Space 


An important object in mathematical physics is the so-called Minkowski
space which is defined as the pair $(M_{(1,3)}, \eta)$, where

$$M_{(1,3)} = \{(t, x^1, x^2, x^3)\ |\ t, x^i \in \mathbf{R}\} \tag{2.33}$$

and $\eta$ is the bilinear map such that

$$\eta(X,X) = t^2 - (x^1)^2 - (x^2)^2 - (x^3)^2. \tag{2.34}$$

The matrix representing Minkowski’s metric $\eta$ is given by

$$\eta = \operatorname{diag}(1,-1,-1,-1),$$

in which case, the differential of arc length is given by

$$\begin{aligned}
ds^2 &= \eta_{\mu\nu}\,dx^\mu\otimes dx^\nu\\
&= dt\otimes dt - dx^1\otimes dx^1 - dx^2\otimes dx^2 - dx^3\otimes dx^3\\
&= dt^2 - (dx^1)^2 - (dx^2)^2 - (dx^3)^2.
\end{aligned} \tag{2.35}$$

Note: Technically speaking, Minkowski’s metric is not really a metric since
$\eta(X,X) = 0$ does not imply that $X = 0$. Non-zero vectors with zero length are
called light-like vectors and they are associated with particles that travel at the
speed of light (which we have set equal to 1 in our system of units.)

The Minkowski metric $\eta_{\mu\nu}$ and its matrix inverse $\eta^{\mu\nu}$ are also used to raise
and lower indices in the space in a manner completely analogous to $\mathbf{R}^n$. Thus,
for example, if $A$ is a covariant vector with components

$$A_\mu = (\rho, A_1, A_2, A_3),$$

then the contravariant components of $A$ are

$$\begin{aligned}
A^\mu &= \eta^{\mu\nu}A_\nu\\
&= (\rho, -A_1, -A_2, -A_3).
\end{aligned}$$
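
Since $\eta$ is diagonal and equal to its own inverse, raising and lowering indices amounts to flipping the sign of the spatial components. The following sketch illustrates this together with a light-like vector; the tuple `ETA` and the function names are introduced here only for illustration.

```python
ETA = (1, -1, -1, -1)  # diagonal entries of eta = diag(1, -1, -1, -1)

def eta(X, Y):
    """Minkowski bilinear form eta(X, Y) for components (t, x1, x2, x3)."""
    return sum(e * x * y for e, x, y in zip(ETA, X, Y))

def raise_index(A_lower):
    """A^mu = eta^{mu nu} A_nu; eta is diagonal and equals its own inverse."""
    return tuple(e * a for e, a in zip(ETA, A_lower))

def lower_index(A_upper):
    """A_mu = eta_{mu nu} A^nu; same arithmetic as raising for this metric."""
    return tuple(e * a for e, a in zip(ETA, A_upper))

A_lower = (5, 1, 2, 3)            # plays the role of (rho, A_1, A_2, A_3)
A_upper = raise_index(A_lower)    # spatial components change sign

light_like = (1, 1, 0, 0)         # non-zero vector with eta(X, X) = 0
```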




2.2.4 Wedge Products and 2-Forms 


2.2.6 Definition A map $\phi : T(\mathbf{R}^n)\times T(\mathbf{R}^n) \to \mathbf{R}$ is called alternating if

$$\phi(X,Y) = -\phi(Y,X).$$

The alternating property is reminiscent of determinants of square matrices that
change sign if any two column vectors are switched. In fact, the determinant
function is a model of an alternating bilinear map on the space $M_{2\times 2}$ of two
by two matrices. Of course, for the definition above to apply, one has to view
$M_{2\times 2}$ as the space of column vectors.


2.2.7 Definition A 2-form $\phi$ is a map $\phi : T(\mathbf{R}^n)\times T(\mathbf{R}^n) \to \mathbf{R}$ which is
alternating and bilinear.


2.2.8 Definition Let $\alpha$ and $\beta$ be 1-forms in $\mathbf{R}^n$ and let $X$ and $Y$ be any
two vector fields. The wedge product of the two 1-forms is the map $\alpha\wedge\beta :
T(\mathbf{R}^n)\times T(\mathbf{R}^n) \to \mathbf{R}$, given by the equation

$$(\alpha\wedge\beta)(X,Y) = \alpha(X)\beta(Y) - \alpha(Y)\beta(X). \tag{2.36}$$


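2.2.9 Theorem If $\alpha$ and $\beta$ are 1-forms, then $\alpha\wedge\beta$ is a 2-form.
Proof Let $\alpha$ and $\beta$ be 1-forms in $\mathbf{R}^n$ and let $X$ and $Y$ be any two vector fields.
Then

$$\begin{aligned}
(\alpha\wedge\beta)(X,Y) &= \alpha(X)\beta(Y) - \alpha(Y)\beta(X)\\
&= -(\alpha(Y)\beta(X) - \alpha(X)\beta(Y))\\
&= -(\alpha\wedge\beta)(Y,X).
\end{aligned}$$

Thus, the wedge product of two 1-forms is alternating.

To show that the wedge product of two 1-forms is bilinear, consider 1-forms,
$\alpha$, $\beta$, vector fields $X_1$, $X_2$, $Y$ and functions $f^1$, $f^2$. Then, since the 1-forms are
linear functionals, we get

$$\begin{aligned}
(\alpha\wedge\beta)(f^1X_1 + f^2X_2, Y) &= \alpha(f^1X_1 + f^2X_2)\beta(Y) - \alpha(Y)\beta(f^1X_1 + f^2X_2)\\
&= [f^1\alpha(X_1) + f^2\alpha(X_2)]\beta(Y) - \alpha(Y)[f^1\beta(X_1) + f^2\beta(X_2)]\\
&= f^1\alpha(X_1)\beta(Y) + f^2\alpha(X_2)\beta(Y) - f^1\alpha(Y)\beta(X_1) - f^2\alpha(Y)\beta(X_2)\\
&= f^1[\alpha(X_1)\beta(Y) - \alpha(Y)\beta(X_1)] + f^2[\alpha(X_2)\beta(Y) - \alpha(Y)\beta(X_2)]\\
&= f^1(\alpha\wedge\beta)(X_1,Y) + f^2(\alpha\wedge\beta)(X_2,Y).
\end{aligned}$$

The proof of linearity on the second slot is quite similar and is left to the reader.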


The wedge product of two 1-forms has characteristics similar to cross products
of vectors in the sense that both of these products anti-commute. This
means that we need to be careful to introduce a minus sign every time we
interchange the order of the operation. Thus, for example, we have

$$dx^i\wedge dx^j = -dx^j\wedge dx^i$$

if $i \neq j$, whereas

$$dx^i\wedge dx^i = -dx^i\wedge dx^i = 0$$

since any quantity that equals the negative of itself must vanish.
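2.2.10 Example Consider the case of $\mathbf{R}^2$. Let

$$\begin{aligned}
\alpha &= a\,dx + b\,dy,\\
\beta &= c\,dx + d\,dy.
\end{aligned}$$

Since $dx\wedge dx = dy\wedge dy = 0$, and $dx\wedge dy = -dy\wedge dx$, we get,

$$\begin{aligned}
\alpha\wedge\beta &= ad\,dx\wedge dy + bc\,dy\wedge dx,\\
&= ad\,dx\wedge dy - bc\,dx\wedge dy,\\
&= \begin{vmatrix} a & b\\ c & d\end{vmatrix}\,dx\wedge dy.
\end{aligned}$$

The similarity between wedge products and cross products is even more striking
in the next example, but we emphasize again that wedge products are much more
powerful than cross products, because wedge products can be computed in any dimension.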


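2.2.11 Example For combinatoric reasons, it is convenient to label the coordinates
as $\{x^1, x^2, x^3\}$. Let

$$\begin{aligned}
\alpha &= a_1\,dx^1 + a_2\,dx^2 + a_3\,dx^3,\\
\beta &= b_1\,dx^1 + b_2\,dx^2 + b_3\,dx^3.
\end{aligned}$$

There are only three independent basis 2-forms, namely

$$\begin{aligned}
dy\wedge dz &= dx^2\wedge dx^3,\\
dz\wedge dx &= -dx^1\wedge dx^3,\\
dx\wedge dy &= dx^1\wedge dx^2.
\end{aligned}$$

Computing the wedge products in pairs, we get

$$\alpha\wedge\beta = \begin{vmatrix} a_2 & a_3\\ b_2 & b_3\end{vmatrix}\,dx^2\wedge dx^3 + \begin{vmatrix} a_1 & a_3\\ b_1 & b_3\end{vmatrix}\,dx^1\wedge dx^3 + \begin{vmatrix} a_1 & a_2\\ b_1 & b_2\end{vmatrix}\,dx^1\wedge dx^2.$$

If we consider vectors $\mathbf{a} = (a_1, a_2, a_3)$ and $\mathbf{b} = (b_1, b_2, b_3)$, we see that the result
above can be written as

$$\alpha\wedge\beta = (\mathbf{a}\times\mathbf{b})_1\,dx^2\wedge dx^3 - (\mathbf{a}\times\mathbf{b})_2\,dx^1\wedge dx^3 + (\mathbf{a}\times\mathbf{b})_3\,dx^1\wedge dx^2 \tag{2.37}$$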
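
Equation (2.37) is easy to confirm on a numerical example. In the sketch below, 1-forms with constant coefficients act on tuples of vector components and the wedge product is evaluated through definition (2.36); the sample coefficients and helper names are arbitrary choices for this illustration.

```python
def one_form(components):
    """Return the 1-form a_i dx^i as a function of a vector X."""
    return lambda X: sum(a * x for a, x in zip(components, X))

def wedge(alpha, beta):
    """(alpha ^ beta)(X, Y) = alpha(X) beta(Y) - alpha(Y) beta(X), as in (2.36)."""
    return lambda X, Y: alpha(X) * beta(Y) - alpha(Y) * beta(X)

def cross(a, b):
    """Classical cross product of two vectors in R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

E1, E2, E3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)

a = (1, 2, 3)
b = (4, 5, 6)
w = wedge(one_form(a), one_form(b))
c = cross(a, b)

# Coefficients of dx^2 ^ dx^3, dx^1 ^ dx^3 and dx^1 ^ dx^2, read off on basis vectors.
coefficients = (w(E2, E3), w(E1, E3), w(E1, E2))
```

Here `coefficients` equals `(c[0], -c[1], c[2])`, which reproduces the sign pattern of (2.37).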




It is worthwhile noticing that if one thinks 
of the indices in the formula above as permu- 
tations of the integers {1, 2,3}, the signs of the 
three terms correspond to the signature of the 
permutation. In particular, the middle term in- 
dices constitute an odd permutation, so the sig- 
nature is minus one. One can get a good sense of 
the geometrical significance and the motivation 
for the creation of wedge products by consider- 
ing a classical analogy in the language of vector 
calculus. As shown in figure 2.1, let us consider 
infinitesimal arc length vectors i dx, j dy and 
k dz pointing along the coordinate axes. Recall 
from the definition, that the cross product of two vectors is a new vector whose 
magnitude is the area of the parallelogram subtended by the two vectors and 
which points in the direction of a unit vector perpendicular to the plane con- 
taining the two vectors, oriented according to the right hand rule. Since i, j and 
k are mutually orthogonal vectors, the cross product of any pair is again a unit 
vector pointed in the direction of the third or the negative thereof. Thus, for 
example, in the $xy$-plane the differential of area is really an oriented quantity
that can be computed by the cross product $(\mathbf{i}\,dx\times\mathbf{j}\,dy) = dx\,dy\,\mathbf{k}$. A similar
computation yields the differential of areas in the other two coordinate planes,
except that in the $xz$-plane, the cross product needs to be taken in the reverse
order. In terms of wedge products, the differential of area in the $xy$-plane is
$(dx\wedge dy)$, so that the oriented nature of the surface element is built-in.
Technically, when reversing the order of variables in a double integral one should
introduce a minus sign. This is typically ignored in basic calculus computations 
of double and triple integrals, but it cannot be ignored in vector calculus in the 
context of flux of a vector field through a surface. 


Fig. 2.1: Area Forms 


2.2.12 Example One could of course compute wedge products by just using
the linearity properties. It would not be as efficient as grouping into pairs, but
it would yield the same result. For example, let

$\alpha = x^2\,dx - y^2\,dy$ and $\beta = dx + dy - 2xy\,dz$. Then,

$$\begin{aligned}
\alpha\wedge\beta &= (x^2\,dx - y^2\,dy)\wedge(dx + dy - 2xy\,dz)\\
&= x^2\,dx\wedge dx + x^2\,dx\wedge dy - 2x^3y\,dx\wedge dz - y^2\,dy\wedge dx\\
&\qquad - y^2\,dy\wedge dy + 2xy^3\,dy\wedge dz\\
&= x^2\,dx\wedge dy - 2x^3y\,dx\wedge dz - y^2\,dy\wedge dx + 2xy^3\,dy\wedge dz\\
&= (x^2 + y^2)\,dx\wedge dy - 2x^3y\,dx\wedge dz + 2xy^3\,dy\wedge dz.
\end{aligned}$$

In local coordinates, a 2-form can always be written in components as

$$\phi = F_{ij}\,dx^i\wedge dx^j \tag{2.38}$$


If we think of $F$ as a matrix with components $F_{ij}$, we know from linear algebra
that we can write $F$ uniquely as a sum of a symmetric and an antisymmetric
matrix, namely,

$$\begin{aligned}
F &= S + A,\\
&= \tfrac{1}{2}(F + F^T) + \tfrac{1}{2}(F - F^T),\\
F_{ij} &= F_{(ij)} + F_{[ij]},
\end{aligned}$$

where,

$$\begin{aligned}
F_{(ij)} &= \tfrac{1}{2}(F_{ij} + F_{ji}),\\
F_{[ij]} &= \tfrac{1}{2}(F_{ij} - F_{ji}),
\end{aligned}$$

are the completely symmetric and antisymmetric components. Since $dx^i\wedge dx^j$
is antisymmetric, and the contraction of a symmetric tensor with an antisymmetric
tensor is zero, one may assume that the components of the 2-form in
equation 2.38 are antisymmetric as well. With this in mind, we can easily find a
formula using wedges that generalizes the cross product to any dimension.

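Let $\alpha = a_i\,dx^i$ and $\beta = b_i\,dx^i$ be any two 1-forms in $\mathbf{R}^n$, and let $X$ and $Y$
be arbitrary vector fields. Then

$$\begin{aligned}
(\alpha\wedge\beta)(X,Y) &= (a_i\,dx^i)(X)(b_j\,dx^j)(Y) - (a_i\,dx^i)(Y)(b_j\,dx^j)(X)\\
&= (a_ib_j)[dx^i(X)\,dx^j(Y) - dx^i(Y)\,dx^j(X)]\\
&= (a_ib_j)(dx^i\wedge dx^j)(X,Y).
\end{aligned}$$

Because of the antisymmetry of the wedge product, the last of the above equations
can be written as

$$\begin{aligned}
\alpha\wedge\beta &= \sum_{i=1}^{n}\sum_{j<i}(a_ib_j - a_jb_i)(dx^i\wedge dx^j),\\
&= \tfrac{1}{2}(a_ib_j - a_jb_i)(dx^i\wedge dx^j).
\end{aligned}$$

In particular, if $n = 3$, the reader will recognize the coefficients of the wedge
product as the components of the cross product of $\mathbf{a} = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}$ and
$\mathbf{b} = b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k}$, as shown earlier.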


Remark Quantities such as dx dy and dy dz which often appear in calculus II, 
are not really well defined. What is meant by them are actually wedge products 
of 1-forms, but in reversing the order of integration, the antisymmetry of the 
wedge product is ignored. In performing surface integrals, however, the surfaces 
must be considered oriented surfaces and one has to insert a negative sign in 
the differential of surface area component in the xz-plane as shown later in 
equation 2.83. 


2.2.5 Determinants 


The properties of $n$-forms are closely related to determinants, so it might be
helpful to digress a bit and review the fundamentals of determinants, as found
in any standard linear algebra textbook such as [16]. Let $A \in M_n$ be an $n\times n$
matrix with column vectors

$$A = [\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n]$$


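2.2.13 Definition A function $f : M_n \to \mathbf{R}$ is called multilinear if it is linear
on each slot; that is,

$$f[\mathbf{v}_1,\dots,a_1\mathbf{v}_i + a_2\mathbf{v}_j,\dots,\mathbf{v}_n] = a_1f[\mathbf{v}_1,\dots,\mathbf{v}_i,\dots,\mathbf{v}_n] + a_2f[\mathbf{v}_1,\dots,\mathbf{v}_j,\dots,\mathbf{v}_n].$$


2.2.14 Definition A function $f : M_n \to \mathbf{R}$ is called alternating if it changes
sign whenever any two columns are switched; that is,

$$f[\mathbf{v}_1,\dots,\mathbf{v}_i,\dots,\mathbf{v}_j,\dots,\mathbf{v}_n] = -f[\mathbf{v}_1,\dots,\mathbf{v}_j,\dots,\mathbf{v}_i,\dots,\mathbf{v}_n]$$


2.2.15 Definition A determinant function is a map $D : M_n \to \mathbf{R}$ that is
a) Multilinear,
b) Alternating,
c) $D(I) = 1$.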


One can then prove that this defines the function uniquely. In particular, if
$A = (a^i_j)$, the determinant can be expressed as

$$\det(A) = \sum_{\pi}\operatorname{sgn}(\pi)\,a^1_{\pi(1)}\,a^2_{\pi(2)}\cdots a^n_{\pi(n)}, \tag{2.39}$$

where the sum is over all the permutations of $\{1,2,\dots,n\}$. The determinant
can also be calculated by the cofactor expansion formula of Laplace. Thus, for
example, the cofactor expansion along the entries on the first row $(a^1_j)$, is given
by

$$\det(A) = \sum_k a^1_k\,\Delta^1_k, \tag{2.40}$$

where $\Delta$ is the cofactor matrix.
At this point it is convenient to introduce the totally antisymmetric Levi-Civita
permutation symbol defined as follows:

$$\epsilon_{i_1i_2\dots i_k} = \begin{cases} +1 & \text{if } (i_1,i_2,\dots,i_k) \text{ is an even permutation of } (1,2,\dots,k)\\ -1 & \text{if } (i_1,i_2,\dots,i_k) \text{ is an odd permutation of } (1,2,\dots,k)\\ \ \ 0 & \text{otherwise}\end{cases} \tag{2.41}$$

In dimension 3, there are only 6 ($3! = 6$) non-vanishing components of $\epsilon_{ijk}$,
namely,

$$\begin{aligned}
\epsilon_{123} &= \epsilon_{231} = \epsilon_{312} = 1\\
\epsilon_{132} &= \epsilon_{213} = \epsilon_{321} = -1
\end{aligned} \tag{2.42}$$




We set the Levi-Civita symbol with some or all the indices up, numerically
equal to the permutation symbol with all the indices down. The permutation
symbols are useful in the theory of determinants. In fact, if $A = (a^i_j)$ is an
$n\times n$ matrix, then, equation (2.39) can be written as,

$$\det A = |A| = \epsilon^{i_1i_2\dots i_n}\,a^1_{i_1}a^2_{i_2}\cdots a^n_{i_n}. \tag{2.43}$$

Thus, for example, for a $2\times 2$ matrix,

$$\begin{aligned}
A &= \begin{pmatrix} a^1_1 & a^1_2\\ a^2_1 & a^2_2\end{pmatrix},\\
\det(A) &= \epsilon^{ij}a^1_ia^2_j,\\
&= \epsilon^{12}a^1_1a^2_2 + \epsilon^{21}a^1_2a^2_1,\\
&= a^1_1a^2_2 - a^1_2a^2_1.
\end{aligned}$$
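
Formula (2.43) translates directly into a short, if inefficient, program. The sketch below is an illustration only: indices are counted from 0 rather than 1, the parity is computed by counting inversions, and the sum is restricted to permutations because the permutation symbol vanishes on any index tuple with a repeated entry.

```python
from itertools import permutations

def levi_civita(indices):
    """+1 or -1 for even or odd permutations of (0, ..., k-1), and 0 otherwise."""
    k = len(indices)
    if sorted(indices) != list(range(k)):
        return 0
    inversions = sum(1 for i in range(k) for j in range(i + 1, k)
                     if indices[i] > indices[j])
    return -1 if inversions % 2 else 1

def det(A):
    """Determinant via (2.43): sum of epsilon^{i1...in} a^1_{i1} ... a^n_{in}."""
    n = len(A)
    total = 0
    for idx in permutations(range(n)):   # other index choices contribute zero
        term = levi_civita(idx)
        for row in range(n):
            term *= A[row][idx[row]]     # row index up, column index down
        total += term
    return total

det2 = det([[1, 2], [3, 4]])
det3 = det([[2, 0, 1], [1, 3, 2], [1, 1, 4]])
```

The number of terms grows like $n!$, so this form is useful for proofs and small examples rather than for numerical work.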
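We also introduce the generalized Kronecker delta symbol

$$\delta^{i_1i_2\dots i_k}_{j_1j_2\dots j_k} = \begin{cases} +1 & \text{if } (i_1,i_2,\dots,i_k) \text{ is an even permutation of } (j_1,j_2,\dots,j_k)\\ -1 & \text{if } (i_1,i_2,\dots,i_k) \text{ is an odd permutation of } (j_1,j_2,\dots,j_k)\\ \ \ 0 & \text{otherwise}\end{cases} \tag{2.44}$$

If one views the indices $i_a$ as labelling rows and $j_b$ as labelling columns of a
matrix, we can represent the completely antisymmetric symbol by the determinant,

$$\delta^{i_1i_2\dots i_k}_{j_1j_2\dots j_k} = \begin{vmatrix} \delta^{i_1}_{j_1} & \delta^{i_1}_{j_2} & \dots & \delta^{i_1}_{j_k}\\ \delta^{i_2}_{j_1} & \delta^{i_2}_{j_2} & \dots & \delta^{i_2}_{j_k}\\ \vdots & \vdots & & \vdots\\ \delta^{i_k}_{j_1} & \delta^{i_k}_{j_2} & \dots & \delta^{i_k}_{j_k}\end{vmatrix} \tag{2.45}$$

Not surprisingly, the generalized Kronecker delta is related to a product of
Levi-Civita symbols by the equation

$$\epsilon^{i_1i_2\dots i_k}\,\epsilon_{j_1j_2\dots j_k} = \delta^{i_1i_2\dots i_k}_{j_1j_2\dots j_k}, \tag{2.46}$$

which is evident since both sides are completely antisymmetric. In dimension
3, the only non-zero components of $\delta^{ij}_{kl}$ are,

$$\begin{aligned}
\delta^{12}_{12} &= \delta^{13}_{13} = \delta^{23}_{23} = 1, & \delta^{12}_{21} &= \delta^{13}_{31} = \delta^{23}_{32} = -1,\\
\delta^{21}_{21} &= \delta^{31}_{31} = \delta^{32}_{32} = 1, & \delta^{21}_{12} &= \delta^{31}_{13} = \delta^{32}_{23} = -1.
\end{aligned}$$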

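2.2.16 Proposition In dimension 3 the following identities hold

a) $\epsilon^{ijk}\epsilon_{imn} = \delta^{jk}_{mn} = \delta^j_m\delta^k_n - \delta^j_n\delta^k_m$,
b) $\epsilon^{ijk}\epsilon_{ijn} = 2\delta^k_n$,
c) $\epsilon^{ijk}\epsilon_{ijk} = 3!$

Proof For part (a), we compute the determinant by cofactor expansion on the
first row

$$\begin{aligned}
\epsilon^{ijk}\epsilon_{imn} &= \begin{vmatrix} \delta^i_i & \delta^i_m & \delta^i_n\\ \delta^j_i & \delta^j_m & \delta^j_n\\ \delta^k_i & \delta^k_m & \delta^k_n\end{vmatrix}\\
&= \delta^i_i\begin{vmatrix} \delta^j_m & \delta^j_n\\ \delta^k_m & \delta^k_n\end{vmatrix} - \delta^i_m\begin{vmatrix} \delta^j_i & \delta^j_n\\ \delta^k_i & \delta^k_n\end{vmatrix} + \delta^i_n\begin{vmatrix} \delta^j_i & \delta^j_m\\ \delta^k_i & \delta^k_m\end{vmatrix}\\
&= 3\begin{vmatrix} \delta^j_m & \delta^j_n\\ \delta^k_m & \delta^k_n\end{vmatrix} - \begin{vmatrix} \delta^j_m & \delta^j_n\\ \delta^k_m & \delta^k_n\end{vmatrix} + \begin{vmatrix} \delta^j_n & \delta^j_m\\ \delta^k_n & \delta^k_m\end{vmatrix}\\
&= (3 - 1 - 1)\begin{vmatrix} \delta^j_m & \delta^j_n\\ \delta^k_m & \delta^k_n\end{vmatrix} = \begin{vmatrix} \delta^j_m & \delta^j_n\\ \delta^k_m & \delta^k_n\end{vmatrix}.
\end{aligned}$$

Here we used the fact that the contraction $\delta^i_i$ is just the trace of the identity
matrix and the observation that we had to transpose columns in the last determinant
in the next to last line. Part (b) follows easily from part (a), namely,

$$\begin{aligned}
\epsilon^{ijk}\epsilon_{ijn} &= \delta^{jk}_{jn}\\
&= \delta^j_j\delta^k_n - \delta^j_n\delta^k_j,\\
&= 3\delta^k_n - \delta^k_n,\\
&= 2\delta^k_n.
\end{aligned}$$

From this, part (c) is obvious. With considerably more effort, but inductively
following the same scheme, one can establish the general formula,

$$\epsilon^{i_1\dots i_k\,i_{k+1}\dots i_n}\,\epsilon_{i_1\dots i_k\,j_{k+1}\dots j_n} = k!\,\delta^{i_{k+1}\dots i_n}_{j_{k+1}\dots j_n}. \tag{2.47}$$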
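
Because all indices range over only three values, the identities of Proposition 2.2.16 can also be confirmed by brute force. The sketch below is meant as a sanity check rather than a proof; indices are counted from 0, and the closed formula used for `eps` is a convenient shortcut valid only in dimension 3.

```python
from itertools import product

def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0

R3 = range(3)

def identity_a_holds():
    """eps^{ijk} eps_{imn} = delta^j_m delta^k_n - delta^j_n delta^k_m."""
    for j, k, m, n in product(R3, repeat=4):
        lhs = sum(eps(i, j, k) * eps(i, m, n) for i in R3)
        rhs = delta(j, m) * delta(k, n) - delta(j, n) * delta(k, m)
        if lhs != rhs:
            return False
    return True

def identity_b_holds():
    """eps^{ijk} eps_{ijn} = 2 delta^k_n."""
    for k, n in product(R3, repeat=2):
        lhs = sum(eps(i, j, k) * eps(i, j, n) for i in R3 for j in R3)
        if lhs != 2 * delta(k, n):
            return False
    return True

# Part (c): full contraction of the symbol with itself.
full_contraction = sum(eps(i, j, k) ** 2 for i, j, k in product(R3, repeat=3))
```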


2.2.6 Vector Identities 


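The permutation symbols are very useful in establishing and manipulating
classical vector formulas. We present here a number of examples. For this
purpose, let,

$$\begin{aligned}
\mathbf{a} &= a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}, & \alpha &= a_1\,dx^1 + a_2\,dx^2 + a_3\,dx^3,\\
\mathbf{b} &= b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k}, & \beta &= b_1\,dx^1 + b_2\,dx^2 + b_3\,dx^3,\\
\mathbf{c} &= c_1\mathbf{i} + c_2\mathbf{j} + c_3\mathbf{k}, & \gamma &= c_1\,dx^1 + c_2\,dx^2 + c_3\,dx^3,\\
\mathbf{d} &= d_1\mathbf{i} + d_2\mathbf{j} + d_3\mathbf{k}, & \delta &= d_1\,dx^1 + d_2\,dx^2 + d_3\,dx^3.
\end{aligned}$$

1. Dot product and cross product

$$\mathbf{a}\cdot\mathbf{b} = \delta^{ij}a_ib_j = a_ib^i, \qquad (\mathbf{a}\times\mathbf{b})_k = \epsilon_k{}^{ij}a_ib_j \tag{2.48}$$

2. Wedge product

$$\alpha\wedge\beta = \tfrac{1}{2}\,\epsilon^k{}_{ij}\,(\mathbf{a}\times\mathbf{b})_k\,dx^i\wedge dx^j. \tag{2.49}$$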



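3. Triple product

$$\begin{aligned}
\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) &= \delta_{ij}a^i(\mathbf{b}\times\mathbf{c})^j,\\
&= \delta_{ij}a^i\epsilon^j{}_{kl}b^kc^l,\\
&= \epsilon_{ikl}a^ib^kc^l,
\end{aligned}$$

$$\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) = \det([\mathbf{a}\,\mathbf{b}\,\mathbf{c}]), \tag{2.50}$$

$$= (\mathbf{a}\times\mathbf{b})\cdot\mathbf{c} \tag{2.51}$$

4. Triple cross product: bac-cab identity

$$\begin{aligned}
[\mathbf{a}\times(\mathbf{b}\times\mathbf{c})]_l &= \epsilon_l{}^{mn}a_m(\mathbf{b}\times\mathbf{c})_n\\
&= \epsilon_l{}^{mn}a_m(\epsilon_n{}^{jk}b_jc_k)\\
&= \epsilon_l{}^{mn}\epsilon_n{}^{jk}a_mb_jc_k\\
&= \epsilon_{mnl}\,\epsilon^{jkn}a^mb_jc_k\\
&= (\delta^j_l\delta^k_m - \delta^k_l\delta^j_m)a^mb_jc_k\\
&= b_la^mc_m - c_la^mb_m.
\end{aligned}$$

Rewriting in vector form

$$\mathbf{a}\times(\mathbf{b}\times\mathbf{c}) = \mathbf{b}(\mathbf{a}\cdot\mathbf{c}) - \mathbf{c}(\mathbf{a}\cdot\mathbf{b}). \tag{2.52}$$

5. Dot product of cross products

$$\begin{aligned}
(\mathbf{a}\times\mathbf{b})\cdot(\mathbf{c}\times\mathbf{d}) &= \mathbf{a}\cdot(\mathbf{b}\times(\mathbf{c}\times\mathbf{d})),\\
&= \mathbf{a}\cdot[\mathbf{c}(\mathbf{b}\cdot\mathbf{d}) - \mathbf{d}(\mathbf{b}\cdot\mathbf{c})]\\
&= (\mathbf{a}\cdot\mathbf{c})(\mathbf{b}\cdot\mathbf{d}) - (\mathbf{a}\cdot\mathbf{d})(\mathbf{b}\cdot\mathbf{c}),
\end{aligned}$$

$$(\mathbf{a}\times\mathbf{b})\cdot(\mathbf{c}\times\mathbf{d}) = \begin{vmatrix} \mathbf{a}\cdot\mathbf{c} & \mathbf{a}\cdot\mathbf{d}\\ \mathbf{b}\cdot\mathbf{c} & \mathbf{b}\cdot\mathbf{d}\end{vmatrix} \tag{2.53}$$

6. Norm of cross-product

$$\begin{aligned}
\|\mathbf{a}\times\mathbf{b}\|^2 &= (\mathbf{a}\times\mathbf{b})\cdot(\mathbf{a}\times\mathbf{b}),\\
&= \begin{vmatrix} \mathbf{a}\cdot\mathbf{a} & \mathbf{a}\cdot\mathbf{b}\\ \mathbf{b}\cdot\mathbf{a} & \mathbf{b}\cdot\mathbf{b}\end{vmatrix},\\
&= \|\mathbf{a}\|^2\,\|\mathbf{b}\|^2 - (\mathbf{a}\cdot\mathbf{b})^2
\end{aligned} \tag{2.54}$$

7. More wedge products. Let $C = c^m\frac{\partial}{\partial x^m}$, $D = d^m\frac{\partial}{\partial x^m}$. Then,

$$(\alpha\wedge\beta)(C,D) = \begin{vmatrix} \alpha(C) & \alpha(D)\\ \beta(C) & \beta(D)\end{vmatrix} \tag{2.55}$$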




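8. Grad, Curl, Div in $\mathbf{R}^3$
Let $\nabla_i = \frac{\partial}{\partial x^i}$, $\nabla^i = \delta^{ij}\nabla_j$, $\mathbf{A} = \mathbf{a}$ and define

• $(\nabla f)_i = \nabla_if$
• $(\nabla\times\mathbf{A})_i = \epsilon_i{}^{jk}\nabla_ja_k$
• $\nabla\cdot\mathbf{A} = \delta^{ij}\nabla_ia_j = \nabla^ja_j$
• $\nabla\cdot\nabla(f) \equiv \nabla^2f = \nabla^i\nabla_if$

(a)
$$(\nabla\times\nabla f)_i = \epsilon_i{}^{jk}\nabla_j\nabla_kf = 0,$$
$$\nabla\times\nabla f = 0 \tag{2.56}$$

(b)
$$\begin{aligned}
\nabla\cdot(\nabla\times\mathbf{A}) &= \delta^{ij}\nabla_i(\nabla\times\mathbf{a})_j,\\
&= \delta^{ij}\nabla_i\,\epsilon_j{}^{kl}\nabla_ka_l,\\
&= \epsilon^{ikl}\nabla_i\nabla_ka_l,
\end{aligned}$$
$$\nabla\cdot(\nabla\times\mathbf{A}) = 0 \tag{2.57}$$

where in the last step in the two items above we use the fact that
a contraction of two symmetric with two antisymmetric indices is
always 0.

(c) The same steps as in the bac-cab identity give

$$[\nabla\times(\nabla\times\mathbf{A})]_l = \nabla_l(\nabla^ma_m) - \nabla^m\nabla_ma_l,$$
$$\nabla\times(\nabla\times\mathbf{A}) = \nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A},$$

where $\nabla^2\mathbf{A}$ means the Laplacian of each component of $\mathbf{A}$.

This last equation is crucial in the derivation of the wave equation for
light from Maxwell’s equations for the electromagnetic field.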
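
The algebraic identities (2.50) to (2.54) are easy to spot-check with integer vectors, for which all arithmetic is exact. The vectors below are arbitrary choices for this sketch, and the helper functions are written out only to keep the example self-contained.

```python
def dot(u, v):
    """Euclidean dot product."""
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    """Classical cross product in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def scale(c, u):
    """Scalar multiple c u."""
    return tuple(c * x for x in u)

def sub(u, v):
    """Vector difference u - v."""
    return tuple(x - y for x, y in zip(u, v))

a, b, c, d = (1, 2, 3), (-2, 0, 5), (4, -1, 2), (3, 3, -7)

# Triple product, (2.50)-(2.51): a . (b x c) = (a x b) . c
triple_lhs = dot(a, cross(b, c))
triple_rhs = dot(cross(a, b), c)

# bac-cab identity (2.52): a x (b x c) = b (a . c) - c (a . b)
bac_cab_lhs = cross(a, cross(b, c))
bac_cab_rhs = sub(scale(dot(a, c), b), scale(dot(a, b), c))

# Dot product of cross products (2.53)
lagrange_lhs = dot(cross(a, b), cross(c, d))
lagrange_rhs = dot(a, c) * dot(b, d) - dot(a, d) * dot(b, c)

# Norm of the cross product (2.54)
norm_lhs = dot(cross(a, b), cross(a, b))
norm_rhs = dot(a, a) * dot(b, b) - dot(a, b) ** 2
```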


2.2.7 n-Forms 


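2.2.17 Definition Let $\alpha^1, \alpha^2, \alpha^3$ be one forms, and $X_1, X_2, X_3 \in \mathcal{X}$. Let
$\pi$ range over the set of permutations of $\{1,2,3\}$. Then

$$\begin{aligned}
(\alpha^1\wedge\alpha^2\wedge\alpha^3)(X_1,X_2,X_3) &= \sum_\pi \operatorname{sign}(\pi)\,\alpha^1(X_{\pi(1)})\,\alpha^2(X_{\pi(2)})\,\alpha^3(X_{\pi(3)}),\\
&= \epsilon^{ijk}\,\alpha^1(X_i)\,\alpha^2(X_j)\,\alpha^3(X_k).
\end{aligned}$$

This trilinear map is an example of an alternating covariant 3-tensor.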




2.2.18 Definition A 3-form $\phi$ in $\mathbf{R}^n$ is an alternating, covariant 3-tensor.
In local coordinates, a 3-form can be written as an object of the following type

$$\phi = A_{ijk}\,dx^i\wedge dx^j\wedge dx^k \tag{2.58}$$

where we assume that the wedge product of three 1-forms is associative but
alternating in the sense that if one switches any two differentials, then the
entire expression changes by a minus sign. There is nothing really wrong with
using definition (2.58). This definition however, is coordinate-dependent and
differential geometers prefer coordinate-free definitions, theorems and proofs.
We can easily extend the concepts above to higher order forms.


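2.2.19 Definition Let $T^0_k(\mathbf{R}^n)$ be the set of multilinear maps

$$t : \underbrace{T(\mathbf{R}^n)\times\dots\times T(\mathbf{R}^n)}_{k\ \text{times}} \to \mathbf{R}$$

from $k$ copies of $T(\mathbf{R}^n)$ to $\mathbf{R}$. The map $t$ is called skew-symmetric if

$$t(e_1,\dots,e_k) = \operatorname{sign}(\pi)\,t(e_{\pi(1)},\dots,e_{\pi(k)}), \tag{2.59}$$

for every permutation $\pi$ of $\{1,\dots,k\}$. A skew-symmetric covariant
tensor of rank $k$ at $p$ is called a $k$-form at $p$. We denote by $\Lambda^k_p(\mathbf{R}^n)$ the space of
$k$-forms at $p \in \mathbf{R}^n$. This vector space has dimension

$$\dim\Lambda^k_p(\mathbf{R}^n) = \binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$

for $k \le n$ and dimension 0 for $k > n$. We identify $\Lambda^0_p(\mathbf{R}^n)$ with the space of
$C^\infty$ functions at $p$. The union of all $\Lambda^k_p(\mathbf{R}^n)$ as $p$ ranges through all points in
$\mathbf{R}^n$ is called the bundle of $k$-forms and will be denoted by

$$\Lambda^k(\mathbf{R}^n) = \bigcup_p \Lambda^k_p(\mathbf{R}^n).$$

Sections of the bundle are called $k$-forms and the space of all sections is denoted
by

$$\Omega^k(\mathbf{R}^n) = \Gamma(\Lambda^k(\mathbf{R}^n)).$$

A section $\alpha \in \Omega^k$ of the bundle technically should be called a $k$-form field, but
the consensus in the literature is to call such a section simply a $k$-form. In local
coordinates, a $k$-form can be written as

$$\alpha = A_{i_1\dots i_k}(x)\,dx^{i_1}\wedge\dots\wedge dx^{i_k}. \tag{2.60}$$


2.2.20 Definition The alternation map $A : T^0_k(\mathbf{R}^n) \to T^0_k(\mathbf{R}^n)$ is defined by

$$At(e_1,\dots,e_k) = \frac{1}{k!}\sum_\pi (\operatorname{sign}\pi)\,t(e_{\pi(1)},\dots,e_{\pi(k)}).$$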




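2.2.21 Definition If $\alpha \in \Omega^k(\mathbf{R}^n)$ and $\beta \in \Omega^l(\mathbf{R}^n)$, then

$$\alpha\wedge\beta = \frac{(k+l)!}{k!\,l!}\,A(\alpha\otimes\beta) \tag{2.61}$$

If $\alpha$ is a $k$-form and $\beta$ an $l$-form, we have

$$\alpha\wedge\beta = (-1)^{kl}\,\beta\wedge\alpha. \tag{2.62}$$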
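
Definitions 2.2.20 and 2.2.21 can be turned into a small working model. In the sketch below, which is only an illustration, a $k$-form at a point is represented as a Python function of $k$ vectors, exact rational arithmetic is used for the factor $1/k!$, and the helper names are our own. The degree of each form has to be supplied by hand.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def sign(perm):
    """Sign of a permutation given as a tuple of 0-based positions."""
    inversions = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def alternate(t, k):
    """Alternation map: At(e_1..e_k) = (1/k!) sum sign(pi) t(e_pi(1)..e_pi(k))."""
    def At(*vectors):
        total = 0
        for perm in permutations(range(k)):
            total += sign(perm) * t(*[vectors[i] for i in perm])
        return Fraction(total) / factorial(k)
    return At

def wedge(alpha, k, beta, l):
    """Wedge of a k-form and an l-form: (k+l)!/(k! l!) A(alpha tensor beta)."""
    def tensor(*v):
        return alpha(*v[:k]) * beta(*v[k:])
    coeff = Fraction(factorial(k + l), factorial(k) * factorial(l))
    alt = alternate(tensor, k + l)
    return lambda *v: coeff * alt(*v)

def one_form(components):
    """The 1-form a_i dx^i acting on a vector given by its components."""
    return lambda X: sum(a * x for a, x in zip(components, X))

E1, E2, E3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)
dx1, dx2, dx3 = one_form(E1), one_form(E2), one_form(E3)

alpha, beta = one_form((1, 2, 3)), one_form((4, 5, 6))
X, Y = (1, 0, 2), (0, 1, 1)

two_form_value = wedge(alpha, 1, beta, 1)(X, Y)   # alpha(X)beta(Y) - alpha(Y)beta(X)
dx12 = wedge(dx1, 1, dx2, 1)
volume = wedge(dx12, 2, dx3, 1)                   # dx^1 ^ dx^2 ^ dx^3
```

With the normalization (2.61), `volume(E1, E2, E3)` equals 1, which is one way to see why this convention avoids extra factorials later.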


Now, for a little combinatorics. Factorials are unavoidable due to the permu- 
tation attributes of the wedge product. The convention here follows Marsden 
[20] and Spivak [34], which reduces proliferation of factorials later. Let us count 
the number of linearly independent differential forms in Euclidean space. More 
specifically, we want to find a basis for the vector space of k-forms in R3. As 
stated above, we will think of 0-forms as being ordinary functions. Since func- 
tions are the “scalars”, the space of 0-forms as a vector space has dimension 
1. 


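$\mathbf{R}^2$ | Forms | Dim
0-forms | $f$ | 1
1-forms | $f\,dx^1$, $g\,dx^2$ | 2
2-forms | $f\,dx^1 \wedge dx^2$ | 1


$\mathbf{R}^3$ | Forms | Dim
0-forms | $f$ | 1
1-forms | $f_1\,dx^1$, $f_2\,dx^2$, $f_3\,dx^3$ | 3
2-forms | $f_1\,dx^2 \wedge dx^3$, $f_2\,dx^3 \wedge dx^1$, $f_3\,dx^1 \wedge dx^2$ | 3
3-forms | $f_1\,dx^1 \wedge dx^2 \wedge dx^3$ | 1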


The binomial coefficient pattern should be evident to the reader. 
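As a small illustration of this count, the sketch below enumerates the increasing index tuples $i_1 < \dots < i_k$ that label the basis forms $dx^{i_1} \wedge \dots \wedge dx^{i_k}$ and compares their number with the binomial coefficient. The function name `basis_k_forms` is a hypothetical helper, and the enumeration uses increasing index order, which may differ by sign or ordering from the cyclic basis listed in the tables.

```python
from itertools import combinations
from math import comb


def basis_k_forms(n, k):
    """Index tuples (i1 < ... < ik) labelling a basis of k-forms on R^n."""
    return list(combinations(range(1, n + 1), k))


for n in (2, 3, 4):
    for k in range(n + 2):
        basis = basis_k_forms(n, k)
        # The count agrees with C(n, k) and drops to zero once k > n.
        assert len(basis) == comb(n, k)
        print(f"R^{n}, k={k}: dim = {len(basis)}, basis indices = {basis}")
```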
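It is possible to define tensor-valued differential forms. Let $E = T^r_s(\mathbf{R}^n)$ be the
tensor bundle. A tensor-valued $p$-form is defined as a section

$$T \in \Omega^p(\mathbf{R}^n, E) = \Gamma(E \otimes \Lambda^p(\mathbf{R}^n)).$$

In local coordinates, a tensor-valued $p$-form is a $\binom{r}{s+p}$ tensor

$$T = T^{i_1 \dots i_r}_{j_1 \dots j_s,\, k_1 \dots k_p}\, \frac{\partial}{\partial x^{i_1}} \otimes \dots \otimes \frac{\partial}{\partial x^{i_r}} \otimes dx^{j_1} \otimes \dots \otimes dx^{j_s} \otimes dx^{k_1} \wedge \dots \wedge dx^{k_p}. \tag{2.63}$$

Thus, for example, the quantity

$$\Omega^i{}_j = \tfrac{1}{2} R^i{}_{jkl}\, dx^k \wedge dx^l$$

would be called the components of a $\binom{1}{1}$-valued 2-form

$$\Omega = \Omega^i{}_j\, \frac{\partial}{\partial x^i} \otimes dx^j.$$

The notion of the wedge product can be extended to tensor-valued forms using
tensor products on the tensorial indices and wedge products on the differential
form indices.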




2.3 Exterior Derivatives 


In this section we introduce a differential operator that generalizes the clas- 
sical gradient, curl and divergence operators. 


2.3.1 Definition Let a be a one form in R”. The differential da is the 
two-form defined by 


da(X,Y) = X(a(Y)) —Y(a(X)), (2.64) 


for any pair of vector fields X and Y. 
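To explore the meaning of this definition in local coordinates, let $\alpha = f_i\, dx^i$
and let $X = \frac{\partial}{\partial x^j}$, $Y = \frac{\partial}{\partial x^k}$; then

$$\begin{aligned}
d\alpha(X, Y) &= \frac{\partial}{\partial x^j}\left[ f_i\, dx^i\!\left(\frac{\partial}{\partial x^k}\right)\right] - \frac{\partial}{\partial x^k}\left[ f_i\, dx^i\!\left(\frac{\partial}{\partial x^j}\right)\right], \\
&= \frac{\partial}{\partial x^j}(f_i\, \delta^i_k) - \frac{\partial}{\partial x^k}(f_i\, \delta^i_j), \\
d\alpha\!\left(\frac{\partial}{\partial x^j}, \frac{\partial}{\partial x^k}\right) &= \frac{\partial f_k}{\partial x^j} - \frac{\partial f_j}{\partial x^k}.
\end{aligned}$$

Therefore, taking into account the antisymmetry of wedge products, we have

$$\begin{aligned}
d\alpha &= \frac{1}{2}\left(\frac{\partial f_k}{\partial x^j} - \frac{\partial f_j}{\partial x^k}\right) dx^j \wedge dx^k, \\
&= \frac{\partial f_k}{\partial x^j}\, dx^j \wedge dx^k.
\end{aligned}$$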


The definition 2.64 of a differential of a 1-form can be refined to provide
a coordinate-free definition in general manifolds (see 6.28), and it can be
extended to differentials of $m$-forms. For now, the computation immediately
above suffices to motivate the following coordinate-dependent definition (for a
coordinate-free definition for general manifolds, see (7.17)):


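2.3.2 Definition Let $\alpha$ be an $m$-form, given in coordinates as in equation (2.60).
The exterior derivative of $\alpha$ is the $(m + 1)$-form $d\alpha$ given by

$$\begin{aligned}
d\alpha &= dA_{i_1 \dots i_m} \wedge dx^{i_1} \wedge \dots \wedge dx^{i_m} \\
&= \frac{\partial A_{i_1 \dots i_m}}{\partial x^{i_0}}\, dx^{i_0} \wedge dx^{i_1} \wedge \dots \wedge dx^{i_m}.
\end{aligned} \tag{2.65}$$

In the special case where $\alpha$ is a 0-form, that is, a function, we write

$$df = \frac{\partial f}{\partial x^i}\, dx^i.$$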




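2.3.3 Theorem


a) $d : \Omega^m \to \Omega^{m+1}$
b) $d^2 = d \circ d = 0$
c) $d(\alpha \wedge \beta) = d\alpha \wedge \beta + (-1)^p\, \alpha \wedge d\beta, \quad \forall \alpha \in \Omega^p,\ \beta \in \Omega^q$ (2.66)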


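Proof
a) Obvious from equation (2.65).
b) First, we prove the proposition for $\alpha = f \in \Omega^0$. We have

$$\begin{aligned}
d(d\alpha) &= d\left(\frac{\partial f}{\partial x^i}\, dx^i\right) \\
&= \frac{\partial^2 f}{\partial x^j \partial x^i}\, dx^j \wedge dx^i \\
&= \frac{1}{2}\left[\frac{\partial^2 f}{\partial x^j \partial x^i} - \frac{\partial^2 f}{\partial x^i \partial x^j}\right] dx^j \wedge dx^i \\
&= 0.
\end{aligned}$$

Now, suppose that $\alpha$ is represented locally as in equation (2.60). It follows from
equation (2.65) that

$$d(d\alpha) = d(dA_{i_1 \dots i_m}) \wedge dx^{i_1} \wedge dx^{i_2} \wedge \dots \wedge dx^{i_m} = 0.$$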
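c) Let $\alpha \in \Omega^p$, $\beta \in \Omega^q$. Then, we can write

$$\begin{aligned}
\alpha &= A_{i_1 \dots i_p}(x)\, dx^{i_1} \wedge \dots \wedge dx^{i_p}, \\
\beta &= B_{j_1 \dots j_q}(x)\, dx^{j_1} \wedge \dots \wedge dx^{j_q}.
\end{aligned} \tag{2.67}$$

By definition,

$$\alpha \wedge \beta = A_{i_1 \dots i_p} B_{j_1 \dots j_q}\, (dx^{i_1} \wedge \dots \wedge dx^{i_p}) \wedge (dx^{j_1} \wedge \dots \wedge dx^{j_q}).$$

Now, we take the exterior derivative of the last equation, taking into account
that $d(fg) = f\,dg + g\,df$ for any functions $f$ and $g$. We get

$$\begin{aligned}
d(\alpha \wedge \beta) &= [d(A_{i_1 \dots i_p})\, B_{j_1 \dots j_q} + A_{i_1 \dots i_p}\, d(B_{j_1 \dots j_q})] \wedge (dx^{i_1} \wedge \dots \wedge dx^{i_p}) \wedge (dx^{j_1} \wedge \dots \wedge dx^{j_q}) \\
&= [dA_{i_1 \dots i_p} \wedge (dx^{i_1} \wedge \dots \wedge dx^{i_p})] \wedge [B_{j_1 \dots j_q}\, (dx^{j_1} \wedge \dots \wedge dx^{j_q})] \\
&\quad + [A_{i_1 \dots i_p}\, (dx^{i_1} \wedge \dots \wedge dx^{i_p})] \wedge (-1)^p\, [dB_{j_1 \dots j_q} \wedge (dx^{j_1} \wedge \dots \wedge dx^{j_q})] \\
&= d\alpha \wedge \beta + (-1)^p\, \alpha \wedge d\beta.
\end{aligned}$$

The $(-1)^p$ factor comes into play since in order to pass the term $dB_{j_1 \dots j_q}$ through
$p$ 1-forms of type $dx^i$, one must perform $p$ transpositions.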
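The property $d \circ d = 0$ can also be verified mechanically on forms with polynomial coefficients. The sketch below is only an illustration under stated assumptions: a polynomial is stored as a dictionary from exponent tuples to coefficients, a form as a dictionary from increasing (0-based) index tuples to polynomials, and the helper names `pdiff`, `padd` and `ext_d` are made up for this example.

```python
def pdiff(poly, i):
    """Partial derivative of a polynomial {exponents: coeff} w.r.t. x^i."""
    out = {}
    for exps, c in poly.items():
        if exps[i] > 0:
            e = list(exps)
            e[i] -= 1
            out[tuple(e)] = out.get(tuple(e), 0) + c * exps[i]
    return out


def padd(p, q, sign=1):
    """Return p + sign*q, dropping monomials whose coefficient cancels."""
    out = dict(p)
    for k, c in q.items():
        out[k] = out.get(k, 0) + sign * c
    return {k: c for k, c in out.items() if c != 0}


def ext_d(form, n):
    """Exterior derivative on R^n: d(A dx^I) = (dA/dx^j) dx^j ^ dx^I."""
    out = {}
    for idx, poly in form.items():
        for j in range(n):
            if j in idx:
                continue  # dx^j ^ dx^j = 0
            # Moving dx^j into sorted position costs one transposition
            # for every index in idx that is smaller than j.
            pos = sum(1 for i in idx if i < j)
            sign = -1 if pos % 2 else 1
            new_idx = tuple(sorted(idx + (j,)))
            out[new_idx] = padd(out.get(new_idx, {}), pdiff(poly, j), sign)
    return {k: p for k, p in out.items() if p}


# 0-form f = x^2 y + y z^3 and 1-form alpha = z^2 dx + x y dz on R^3.
f = {(2, 1, 0): 1, (0, 1, 3): 1}
alpha = {(0,): {(0, 0, 2): 1}, (2,): {(1, 1, 0): 1}}

print(ext_d({(): f}, 3))             # df
print(ext_d(ext_d({(): f}, 3), 3))   # d(df): empty dict, i.e. zero
print(ext_d(ext_d(alpha, 3), 3))     # d(d alpha): empty dict, i.e. zero
```

The cancellation in `ext_d(ext_d(...))` is exactly the equality of mixed partial derivatives paired with the antisymmetry of the wedge product used in part (b) of the proof.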




2.3.1 Pull-back 


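2.3.4 Definition Let $F : \mathbf{R}^n \to \mathbf{R}^m$ be a differentiable mapping and let $\alpha$
be a $k$-form in $\mathbf{R}^m$. Then, at each point $y \in \mathbf{R}^m$ with $y = F(x)$, the mapping
$F$ induces a map called the pull-back $F^* : \Omega^k_{(F(x))} \to \Omega^k_{(x)}$ defined by

$$(F^*\alpha)_x(X_1, \dots, X_k) = \alpha_{F(x)}(F_* X_1, \dots, F_* X_k), \tag{2.68}$$

for any tangent vectors $\{X_1, \dots, X_k\}$ in $\mathbf{R}^n$.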
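If $g$ is a 0-form, namely a function, $F^*(g) = g \circ F$. We have the following
theorem.


2.3.5 Theorem


a) $F^*(g\,\alpha_1) = (g \circ F)\, F^*\alpha_1$,
b) $F^*(\alpha_1 + \alpha_2) = F^*\alpha_1 + F^*\alpha_2$,
c) $F^*(\alpha \wedge \beta) = F^*\alpha \wedge F^*\beta$,
d) $F^*(d\alpha) = d(F^*\alpha)$. (2.69)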


Part (d) is encapsulated in the commuting diagram in figure 2.2. 


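[Fig. 2.2: Commuting diagram for $d\,F^* = F^*\,d$. Horizontal arrows: $F^* : \Omega^k(\mathbf{R}^m) \to \Omega^k(\mathbf{R}^n)$ and $F^* : \Omega^{k+1}(\mathbf{R}^m) \to \Omega^{k+1}(\mathbf{R}^n)$; vertical arrows: $d$.]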


Proof Part (a) is basically the definition for the case of 0-forms and part (b)
is clear from the linearity of the push-forward. We leave part (c) as an exercise
and prove part (d). In the case of a 0-form, let $g$ be a function and $X$ a vector
field in $\mathbf{R}^n$. By a simple computation that amounts to recycling definitions,
we have:

$$\begin{aligned}
d(F^*g) &= d(g \circ F), \\
(F^*dg)(X) &= dg(F_* X) = (F_* X)(g), \\
&= X(g \circ F) = d(g \circ F)(X), \\
F^*dg &= d(g \circ F),
\end{aligned}$$

so $F^*(dg) = d(F^*g)$ is true by the composite mapping theorem. Let $\alpha$ be a
$k$-form

$$\alpha = A_{i_1 \dots i_k}\, dy^{i_1} \wedge \dots \wedge dy^{i_k},$$

so that

$$d\alpha = (dA_{i_1 \dots i_k}) \wedge dy^{i_1} \wedge \dots \wedge dy^{i_k}.$$




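Then, by part (c),

$$\begin{aligned}
F^*\alpha &= (F^* A_{i_1 \dots i_k})\, F^*dy^{i_1} \wedge \dots \wedge F^*dy^{i_k}, \\
d(F^*\alpha) &= dF^*(A_{i_1 \dots i_k}) \wedge F^*dy^{i_1} \wedge \dots \wedge F^*dy^{i_k}, \\
&= F^*(d\alpha).
\end{aligned}$$

In the second line only the coefficient is differentiated, because each factor
$F^*dy^{i} = d(F^*y^{i})$ is closed; the third line uses the 0-form case. So again,
the result rests on the chain rule.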


To connect with advanced calculus, suppose that locally the mapping $F$ is
given by $y^k = f^k(x^i)$. Then the pullback of the form $dg$, given by the formula
above $F^*dg = d(g \circ F)$, is given in local coordinates by the chain rule

$$F^*dg = \frac{\partial g}{\partial x^j}\, dx^j.$$

In particular, the pull-back of local coordinate functions is given by

$$F^*(dy^i) = \frac{\partial y^i}{\partial x^j}\, dx^j. \tag{2.70}$$

Thus, pullback for the basis 1-forms $dy^k$ is yet another manifestation of the
differential as a linear map represented by the Jacobian

$$dy^k = \frac{\partial y^k}{\partial x^i}\, dx^i. \tag{2.71}$$


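In particular, if $m = n$,

$$\begin{aligned}
d\Omega &= dy^1 \wedge dy^2 \wedge \dots \wedge dy^n, \\
&= \frac{\partial y^1}{\partial x^{i_1}} \frac{\partial y^2}{\partial x^{i_2}} \dots \frac{\partial y^n}{\partial x^{i_n}}\, dx^{i_1} \wedge dx^{i_2} \wedge \dots \wedge dx^{i_n}, \\
&= \epsilon^{i_1 i_2 \dots i_n}\, \frac{\partial y^1}{\partial x^{i_1}} \frac{\partial y^2}{\partial x^{i_2}} \dots \frac{\partial y^n}{\partial x^{i_n}}\, dx^1 \wedge dx^2 \wedge \dots \wedge dx^n, \\
&= |J|\, dx^1 \wedge \dots \wedge dx^n.
\end{aligned} \tag{2.72}$$

So, the pull-back of the volume form,

$$F^*d\Omega = |J|\, dx^1 \wedge \dots \wedge dx^n,$$

gives rise to the integrand that appears in the change of variables theorem for
integration. More explicitly, let $R \subset \mathbf{R}^n$ be a simply connected region and let $F$ be a
mapping $F : R \subset \mathbf{R}^n \to \mathbf{R}^m$, with $m \ge n$. If $\omega$ is a $k$-form in $\mathbf{R}^m$, then

$$\int_{F(R)} \omega = \int_R F^*\omega. \tag{2.73}$$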


We refer to this formulation of the change of variables theorem as integration 
by pull-back. 




If $F : \mathbf{R}^n \to \mathbf{R}^n$ is a diffeomorphism, one can push-forward forms with the
inverse of the pull-back, $F_* = (F^{-1})^*$.


2.3.6 Example Line Integrals 

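Let $\omega = f_i\, dx^i$ be a one form in $\mathbf{R}^3$ and let $C$ be the curve given by the mapping
$\phi : I = \{t \in [a, b]\} \to \mathbf{x}(t) \in \mathbf{R}^3$. We can write $\omega = \mathbf{F} \cdot d\mathbf{x}$, where $\mathbf{F} = (f_1, f_2, f_3)$
is a vector field. Then the integration by pull-back equation 2.73 reads,

$$\int_C \omega = \int_I \phi^*\omega = \int_a^b f_i(\mathbf{x}(t))\, \frac{dx^i}{dt}\, dt = \int_a^b \mathbf{F}(\mathbf{x}(t)) \cdot \frac{d\mathbf{x}}{dt}\, dt.$$

This coincides with the definition of line integrals as introduced in calculus.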


2.3.7 Example Polar Coordinates 
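Let $x = r\cos\theta$ and $y = r\sin\theta$ and $f = f(x, y)$. Then

$$\begin{aligned}
dx \wedge dy &= (-r\sin\theta\, d\theta + \cos\theta\, dr) \wedge (r\cos\theta\, d\theta + \sin\theta\, dr), \\
&= -r\sin^2\theta\, d\theta \wedge dr + r\cos^2\theta\, dr \wedge d\theta, \\
&= (r\cos^2\theta + r\sin^2\theta)(dr \wedge d\theta), \\
&= r\,(dr \wedge d\theta).
\end{aligned}$$

$$\iint f(x, y)\, dx \wedge dy = \iint f(x(r, \theta), y(r, \theta))\, r\,(dr \wedge d\theta). \tag{2.74}$$

In this case, the element of arc length is diagonal

$$ds^2 = dr^2 + r^2 d\theta^2,$$

as it should be for an orthogonal change of variables. The differential of area is

$$\begin{aligned}
dA &= \sqrt{\det g}\; dr \wedge d\theta, \\
&= r\,(dr \wedge d\theta).
\end{aligned}$$

If the polar coordinates map is denoted by $F : \mathbf{R}^2 \to \mathbf{R}^2$, then equation 2.74 is
just the explicit expression for the pullback $F^*(f\, dA)$.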
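The polar-coordinate pull-back can be checked numerically. The following sketch is an illustration, not part of the text: it computes the coefficient of $dr \wedge d\theta$ in the pull-back of $dx \wedge dy$ as the Jacobian determinant and then integrates over a disk with a midpoint rule. The function names and the grid size `n` are arbitrary choices.

```python
import math


def pullback_area_coefficient(r, theta):
    """Coefficient c in F*(dx ^ dy) = c dr ^ dtheta for
    x = r cos(theta), y = r sin(theta); c is the Jacobian determinant."""
    x_r, x_t = math.cos(theta), -r * math.sin(theta)
    y_r, y_t = math.sin(theta), r * math.cos(theta)
    return x_r * y_t - x_t * y_r


def integrate_polar(f, radius, n=200):
    """Midpoint-rule approximation of the integral of f(x, y) dx ^ dy over
    the disk of the given radius, computed after pulling back to (r, theta)."""
    dr, dt = radius / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            t = (j + 0.5) * dt
            x, y = r * math.cos(t), r * math.sin(t)
            total += f(x, y) * pullback_area_coefficient(r, t) * dr * dt
    return total


print(pullback_area_coefficient(2.0, 0.7))           # close to r = 2
print(integrate_polar(lambda x, y: 1.0, 1.0))        # close to pi (disk area)
print(integrate_polar(lambda x, y: x*x + y*y, 1.0))  # close to pi / 2
```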


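2.3.8 Example Polar coordinates are just a special example of the general
transformation in $\mathbf{R}^2$ given by,

$$\begin{aligned}
x &= x(u, v), & dx &= \frac{\partial x}{\partial u}\, du + \frac{\partial x}{\partial v}\, dv, \\
y &= y(u, v), & dy &= \frac{\partial y}{\partial u}\, du + \frac{\partial y}{\partial v}\, dv,
\end{aligned}$$

for which

$$\phi^*(dx \wedge dy) = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[1ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix}\, du \wedge dv. \tag{2.75}$$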


2.3.9 Example Surface Integrals 
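Let $R \subset \mathbf{R}^2$ be a simply connected region with boundary $\delta R$ and let the mapping

$$\phi : (u^\alpha) \in R \to \mathbf{x}(u^\alpha) \in \mathbf{R}^3$$

describe a surface $S$ with boundary $C = \phi(\delta R)$. Here, $\alpha = 1, 2$, with $u = u^1$, $v = u^2$.
Given a vector field $\mathbf{F} = (f_1, f_2, f_3)$, we assign to it the 2-form

$$\begin{aligned}
\omega &= \mathbf{F} \cdot d\mathbf{S}, \\
&= f_1\, dx^2 \wedge dx^3 - f_2\, dx^1 \wedge dx^3 + f_3\, dx^1 \wedge dx^2, \\
&= \tfrac{1}{2}\, \epsilon^i{}_{jk}\, f_i\, dx^j \wedge dx^k.
\end{aligned}$$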

Then, 


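We elaborate a bit on this slick computation, for the benefit of those readers
who may have gotten lost in the index manipulation.

$$\begin{aligned}
\iint_S \omega &= \iint_R \phi^*\omega, \\
&= \iint_R \left[ f_1\, \phi^*(dx^2 \wedge dx^3) - f_2\, \phi^*(dx^1 \wedge dx^3) + f_3\, \phi^*(dx^1 \wedge dx^2) \right], \\
&= \iint_R \left[ f_1 \begin{vmatrix} \dfrac{\partial x^2}{\partial u} & \dfrac{\partial x^2}{\partial v} \\[1ex] \dfrac{\partial x^3}{\partial u} & \dfrac{\partial x^3}{\partial v} \end{vmatrix} - f_2 \begin{vmatrix} \dfrac{\partial x^1}{\partial u} & \dfrac{\partial x^1}{\partial v} \\[1ex] \dfrac{\partial x^3}{\partial u} & \dfrac{\partial x^3}{\partial v} \end{vmatrix} + f_3 \begin{vmatrix} \dfrac{\partial x^1}{\partial u} & \dfrac{\partial x^1}{\partial v} \\[1ex] \dfrac{\partial x^2}{\partial u} & \dfrac{\partial x^2}{\partial v} \end{vmatrix} \right] du \wedge dv.
\end{aligned}$$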




This pull-back formula for surface integrals is how most students are introduced 
to this subject in the third semester of calculus. 


2.3.10 Remark 


1. The differential of area in polar coordinates is of course a special example 
of the change of coordinate theorem for multiple integrals as indicated 
above. 


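2. As shown in equation 2.32 the metric in spherical coordinates is given by

$$ds^2 = dr^2 + r^2\, d\theta^2 + r^2 \sin^2\theta\, d\phi^2,$$

so the differential of volume is

$$\begin{aligned}
dV &= \sqrt{\det g}\; dr \wedge d\theta \wedge d\phi, \\
&= r^2 \sin\theta\; dr \wedge d\theta \wedge d\phi.
\end{aligned}$$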


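2.3.2 Stokes’ Theorem in $\mathbf{R}^n$
Let $\alpha = P(x, y)\, dx + Q(x, y)\, dy$. Then,

$$\begin{aligned}
d\alpha &= \left(\frac{\partial P}{\partial x}\, dx + \frac{\partial P}{\partial y}\, dy\right) \wedge dx + \left(\frac{\partial Q}{\partial x}\, dx + \frac{\partial Q}{\partial y}\, dy\right) \wedge dy \\
&= \frac{\partial P}{\partial y}\, dy \wedge dx + \frac{\partial Q}{\partial x}\, dx \wedge dy \\
&= \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx \wedge dy.
\end{aligned} \tag{2.76}$$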


This example is related to Green’s theorem in $\mathbf{R}^2$. For convenience, we include
here a proof of Green’s Theorem in a special case. We say that a region $D$
in the plane is of type I if it is enclosed between the graphs of two continuous
functions of $x$. The region inside the simple closed curve in figure 2.3 bounded
by $f_1(x)$ and $f_2(x)$, between $a$ and $b$, is a region of type I. A region in the plane
is of type II if it lies between two continuous functions of $y$. The region in figure 2.3
bounded between $c \le y \le d$ would be a region of type II.

[Fig. 2.3: Simple closed curve.]


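2.3.11 Green’s theorem


Let $C$ be a simple closed curve in the $xy$-plane and let $\partial P/\partial y$ and $\partial Q/\partial x$ be
continuous functions of $(x, y)$ inside and on $C$. Let $R$ be the region inside the
closed curve so that the boundary $\delta R = C$. Then

$$\oint_{\delta R} P\, dx + Q\, dy = \iint_R \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dA. \tag{2.77}$$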


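We first prove that for a type I region $D$ such as the one bounded between $a$ and
$b$ shown in figure 2.3, we have

$$\oint_C P\, dx = -\iint_D \frac{\partial P}{\partial y}\, dA, \tag{2.78}$$

where $C$ comprises the curves $C_1$, $C_2$, $C_3$ and $C_4$. By the fundamental theorem
of calculus, we have on the right,

$$\begin{aligned}
\iint_D \frac{\partial P}{\partial y}\, dA &= \int_a^b \int_{f_1(x)}^{f_2(x)} \frac{\partial P}{\partial y}\, dy\, dx, \\
&= \int_a^b [P(x, f_2(x)) - P(x, f_1(x))]\, dx.
\end{aligned}$$

On the left, the integrals along $C_2$ and $C_4$ vanish, since there is no variation on
$x$. The integral along $C_3$ is traversed in opposite direction of $C_1$, so we have,

$$\begin{aligned}
\oint_C P(x, y)\, dx &= \int_{C_1} + \int_{C_2} + \int_{C_3} + \int_{C_4} P(x, y)\, dx, \\
&= \int_{C_1} P(x, y)\, dx + \int_{C_3} P(x, y)\, dx, \\
&= \int_a^b P(x, f_1(x))\, dx - \int_a^b P(x, f_2(x))\, dx.
\end{aligned}$$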

This establishes the veracity of equation 2.78 for type I regions. By a completely
analogous process on type II regions, we find that

$$\oint_C Q\, dy = \iint_D \frac{\partial Q}{\partial x}\, dA. \tag{2.79}$$


The theorem follows by subdividing $R$ into a grid of regions of both types, all
oriented in the same direction as shown on the right in figure 2.3. Then one
applies equations 2.78 or 2.79, as appropriate, for each of the subdomains. All
contributions from internal boundaries cancel since each is traversed twice, each
in opposite directions. All that remains of the line integrals is the contribution
along the boundary $\delta R$.

Let $\alpha = P\, dx + Q\, dy$. Comparing with equation 2.76, we can write Green’s
theorem in the form

$$\int_C \alpha = \iint_D d\alpha. \tag{2.80}$$
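A quick numerical sanity check of Green’s theorem on the unit square is sketched below. The choice of $P$, $Q$ and of the region is an assumption made purely for illustration, the combination $\partial Q/\partial x - \partial P/\partial y$ is differentiated by hand, and both integrals use a plain midpoint rule.

```python
def line_integral(P, Q, vertices, n=1000):
    """Midpoint-rule approximation of the integral of P dx + Q dy along the
    closed polygon through `vertices`, traversed in the order given."""
    total = 0.0
    m = len(vertices)
    for k in range(m):
        (x0, y0), (x1, y1) = vertices[k], vertices[(k + 1) % m]
        dx, dy = (x1 - x0) / n, (y1 - y0) / n
        for i in range(n):
            t = (i + 0.5) / n
            x, y = x0 + t * (x1 - x0), y0 + t * (y1 - y0)
            total += P(x, y) * dx + Q(x, y) * dy
    return total


def double_integral(g, n=200):
    """Midpoint-rule approximation of the integral of g over [0,1] x [0,1]."""
    h = 1.0 / n
    return sum(g((i + 0.5) * h, (j + 0.5) * h)
               for i in range(n) for j in range(n)) * h * h


def P(x, y):
    return x * y * y


def Q(x, y):
    return x * x + y


def curl(x, y):
    # dQ/dx - dP/dy for the sample P and Q above, computed by hand.
    return 2 * x - 2 * x * y


SQUARE = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]  # counterclockwise

print(line_integral(P, Q, SQUARE))  # boundary integral of alpha
print(double_integral(curl))        # integral of d(alpha) over the square
```

Both numbers agree with the exact value $1/2$ obtained by integrating each side by hand.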




It is possible to extend Green’s Theorem to more complicated regions that are
not simply connected. Green’s theorem is a special case, in dimension two, of
Stokes’ theorem.


2.3.12 Stokes’ theorem 


If $\omega$ is a $C^1$ one form in $\mathbf{R}^n$ and $S$ is a $C^2$ surface with boundary $\delta S = C$, then

$$\int_C \omega = \iint_S d\omega. \tag{2.81}$$


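Proof The proof can be done by pulling back to the $uv$-plane and using the
chain rule, thus allowing us to use Green’s theorem. Let $\omega = f_i\, dx^i$ and $S$ be
parametrized by $x^i = x^i(u^\alpha)$, where $(u^1, u^2) \in R \subset \mathbf{R}^2$. We assume that the
boundary of $R$ is a simple closed curve. Then

$$\begin{aligned}
\int_C \omega &= \int_{\delta R} f_i\, \frac{\partial x^i}{\partial u^\alpha}\, du^\alpha, \\
&= \iint_R \frac{\partial}{\partial u^\beta}\left( f_i\, \frac{\partial x^i}{\partial u^\alpha} \right) du^\beta \wedge du^\alpha, \\
&= \iint_R \left[ \frac{\partial f_i}{\partial x^k} \frac{\partial x^k}{\partial u^\beta} \frac{\partial x^i}{\partial u^\alpha}\, du^\beta \wedge du^\alpha + f_i\, \frac{\partial^2 x^i}{\partial u^\beta \partial u^\alpha}\, du^\beta \wedge du^\alpha \right], \\
&= \iint_R \frac{\partial f_i}{\partial x^k} \frac{\partial x^k}{\partial u^\beta} \frac{\partial x^i}{\partial u^\alpha}\, du^\beta \wedge du^\alpha, \\
&= \iint_R \frac{\partial f_i}{\partial x^k} \left[ \frac{\partial x^k}{\partial u^\beta}\, du^\beta \right] \wedge \left[ \frac{\partial x^i}{\partial u^\alpha}\, du^\alpha \right], \\
&= \iint_S \frac{\partial f_i}{\partial x^k}\, dx^k \wedge dx^i = \iint_S df_i \wedge dx^i, \\
&= \iint_S d\omega.
\end{aligned}$$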

We present a less intuitive but far more elegant proof. The idea is formally 
the same, namely, we pull-back to the plane by formula 2.73, apply Green’s 
theorem in the form given in equation 2.80, and then use the fact that the 
pull-back commutes with the differential as in theorem 2.69. 


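Let $\phi : R \subset \mathbf{R}^2 \to S$ denote the surface parametrization map. Assume that
$\phi^{-1}(\delta S) = \delta(\phi^{-1} S)$, that is, the inverse image of the boundary of $S$ is the boundary
of the domain $R$. Then,

$$\begin{aligned}
\int_{\delta S} \omega &= \int_{\phi^{-1}(\delta S)} \phi^*\omega = \int_{\delta(\phi^{-1} S)} \phi^*\omega, \\
&= \iint_{\phi^{-1} S} d(\phi^*\omega), \\
&= \iint_{\phi^{-1} S} \phi^*(d\omega), \\
&= \iint_S d\omega.
\end{aligned}$$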


The proof of Stokes’ theorem presented here is one of those cases mentioned in 
the preface, where we have simplified the mathematics for the sake of clarity. 
Among other things, a rigorous proof requires one to quantify what is meant by
the boundary ($\delta S$) of a region. The process involves either introducing simplices
(generalized segments, triangles, tetrahedra...) or singular cubes (generalized
segments, rectangles, cubes...). The former are preferred in the treatment of
homology in algebraic topology, but the latter are more natural to use in the
context of integration on manifolds with boundary. A singular $n$-cube in $\mathbf{R}^n$ is
the image under a continuous map,

$$I^n : [0, 1]^n \to \mathbf{R}^n,$$

of the Cartesian product of $n$ copies of the unit interval $[0, 1]$. The idea is
to divide the region $S$ into formal finite sums of singular cubes, called chains.
One then introduces a boundary operator $\delta$ that maps a singular $n$-cube, and
hence an $n$-chain, into a singular $(n - 1)$-cube or $(n - 1)$-chain. Thus, in $\mathbf{R}^3$ for
example, the boundary of a cube is the sum $\sum c_i F_i$ of the six faces with a
judicious choice of coefficients $c_i \in \{-1, 1\}$. With an appropriate scheme to
label faces of a singular cube and a corresponding definition of the boundary
map, one proves that $\delta \circ \delta = 0$. For a thorough treatment, see the beautiful
book Calculus on Manifolds by M. Spivak [33]. 


Closed and Exact forms 


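2.3.13 Example Let $\alpha = M(x, y)\, dx + N(x, y)\, dy$, and suppose that $d\alpha = 0$.
Then, by the previous example,

$$d\alpha = \left(\frac{\partial N}{\partial x} - \frac{\partial M}{\partial y}\right) dx \wedge dy.$$

Thus, $d\alpha = 0$ iff $N_x = M_y$, which implies that $N = f_y$ and $M = f_x$ for some
function $f(x, y)$. Hence,

$$\alpha = f_x\, dx + f_y\, dy = df.$$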


The reader should also be familiar with this example in the context of exact 
differential equations of first order and conservative force fields. 


2.3.14 Definition A differential form $\alpha$ is called closed if $d\alpha = 0$.




2.3.15 Definition A differential form $\alpha$ is called exact if there exists a form
$\beta$ such that $\alpha = d\beta$.


Since $d \circ d = 0$, it is clear that an exact form is also closed. The converse need
not be true. The standard counterexample is the form,

$$\omega = \frac{-y\, dx + x\, dy}{x^2 + y^2}. \tag{2.82}$$

A short computation shows that $d\omega = 0$, so $\omega$ is closed. Let $\theta = \tan^{-1}(y/x)$ be
the angle in polar coordinates. One can recognize that $\omega = d\theta$, but this is only
true in $\mathbf{R}^2 - L$, where $L$ is the non-negative $x$-axis, $L = \{(x, 0) \in \mathbf{R}^2 \mid x \ge 0\}$.
If one computes the line integral from $(1, 0)$ to $(-1, 0)$ along the top half of the
unit circle, the result is $\pi$. But the line integral between the same endpoints along
the bottom half of the unit circle gives $-\pi$. The integral is therefore not path
independent, so $\omega \ne d\theta$ on any region that contains the origin. If one tries to find
another $C^1$ function $f$ such that $\omega = df$, one can easily show that $f = \theta + \text{const}$,
which is not possible along $L$.
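The failure of path independence is easy to observe numerically. The sketch below is illustrative only (the function name and step count are arbitrary): it integrates the form $(-y\,dx + x\,dy)/(x^2 + y^2)$ along arcs of the unit circle parametrized by the angle, taking both arcs from $(1, 0)$ to $(-1, 0)$.

```python
import math


def integrate_omega(theta_start, theta_end, n=10000):
    """Midpoint-rule integral of (-y dx + x dy) / (x^2 + y^2) along the
    unit-circle arc from angle theta_start to angle theta_end."""
    h = (theta_end - theta_start) / n
    total = 0.0
    for i in range(n):
        t = theta_start + (i + 0.5) * h
        x, y = math.cos(t), math.sin(t)
        dx, dy = -math.sin(t) * h, math.cos(t) * h
        total += (-y * dx + x * dy) / (x * x + y * y)
    return total


top = integrate_omega(0.0, math.pi)      # (1,0) -> (-1,0) through the upper half
bottom = integrate_omega(0.0, -math.pi)  # (1,0) -> (-1,0) through the lower half

print(top, bottom)    # approximately pi and -pi
print(top - bottom)   # approximately 2*pi: the integral around the full circle
```

The nonzero value around the closed circle is what prevents the form from being exact on any region that encloses the origin.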

On the other hand, if one imposes the topological condition that the space 
is contractible, then the statement is true. A contractible space is one that can 
be deformed continuously to an interior point. We have the following, 


2.3.16 Poincaré Lemma. In a contractible space (such as $\mathbf{R}^n$), if a differential
form is closed, then it is exact.

To prove this lemma we need much more machinery than we have available 
at this point. We present the proof in 7.1.17. 


2.4 The Hodge $\star$ Operator


2.4.1 Dual Forms 


An important lesson students learn in linear algebra is that all vector spaces
of finite dimension $n$ are isomorphic to each other. Thus, for instance, the space
$P_3$ of all real polynomials in $x$ of degree at most 3, and the space $M_{2 \times 2}$ of real 2 by
2 matrices are, in terms of their vector space properties, basically no different
from the Euclidean vector space $\mathbf{R}^4$. As a good example of this, consider the
tangent space $T_p\mathbf{R}^3$. The process of replacing $\frac{\partial}{\partial x}$ by $\mathbf{i}$, $\frac{\partial}{\partial y}$ by $\mathbf{j}$ and $\frac{\partial}{\partial z}$ by $\mathbf{k}$
is a linear, 1-1 and onto map that sends the “vector” part of a tangent vector
$a^1 \frac{\partial}{\partial x} + a^2 \frac{\partial}{\partial y} + a^3 \frac{\partial}{\partial z}$ to a regular Euclidean vector $(a^1, a^2, a^3)$.

We have also observed that the tangent space $T_p\mathbf{R}^n$ is isomorphic to the
cotangent space $T^*_p\mathbf{R}^n$. In this case, the vector space isomorphism maps the
standard basis vectors $\{\frac{\partial}{\partial x^i}\}$ to their duals $\{dx^i\}$. This isomorphism then
transforms a contravariant vector to a covariant vector. In terms of components, the
isomorphism is provided by the Euclidean metric that maps the components of
a contravariant vector with indices up to a covariant vector with indices down.

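Another interesting example is provided by the spaces $\Lambda^1_p(\mathbf{R}^3)$ and $\Lambda^2_p(\mathbf{R}^3)$,
both of which have dimension 3. It follows that these two spaces must be
isomorphic. In this case the isomorphism is given as follows:

$$\begin{aligned}
dx &\longmapsto dy \wedge dz \\
dy &\longmapsto -dx \wedge dz \\
dz &\longmapsto dx \wedge dy
\end{aligned} \tag{2.83}$$

More generally, we have seen that the dimension of the space of $k$-forms in
$\mathbf{R}^n$ is given by the binomial coefficient $\binom{n}{k}$. Since

$$\binom{n}{k} = \binom{n}{n-k} = \frac{n!}{k!\,(n-k)!},$$

it must be true that

$$\Lambda^k_p(\mathbf{R}^n) \cong \Lambda^{n-k}_p(\mathbf{R}^n). \tag{2.84}$$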
To describe the isomorphism between these two spaces, we introduce the fol- 
lowing generalization of determinants, 


2.4.1 Definition Let $\phi : \mathbf{R}^n \to \mathbf{R}^n$ be a linear map. The unique constant
$\det \phi$ such that,

$$\phi^* : \Lambda^n(\mathbf{R}^n) \to \Lambda^n(\mathbf{R}^n)$$

satisfies,

$$\phi^*\omega = (\det \phi)\, \omega, \tag{2.85}$$

for all $n$-forms, is called the determinant of $\phi$. This is consistent with the
standard linear algebra formula 2.43, since in a particular basis, the Jacobian
of a linear map is the same as the matrix that represents the linear map in that
basis. Let $g(X, Y)$ be an inner product and $\{e_1, \dots, e_n\}$ be an orthonormal
basis with dual forms $\{\theta^1, \dots, \theta^n\}$. The element of arc length is the bilinear
symmetric tensor

$$ds^2 = g_{ij}\, \theta^i \otimes \theta^j.$$

The metric then induces an $n$-form

$$d\Omega = \theta^1 \wedge \theta^2 \wedge \dots \wedge \theta^n,$$

called the volume element. With this choice of form, the reader will recognize
equation 2.85 as the integrand in the change of variables theorem for multiple
integration, as in example 2.74. More generally, if $\{f_1, \dots, f_n\}$ is a positively
oriented basis with dual basis $\{\phi^1, \dots, \phi^n\}$, then,

$$d\Omega = \sqrt{\det g}\; \phi^1 \wedge \dots \wedge \phi^n. \tag{2.86}$$


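2.4.2 Definition Let $g$ be the matrix representing the components of the
metric in $\mathbf{R}^n$. The Hodge $\star$ operator is the linear isomorphism
$\star : \Lambda^k_p(\mathbf{R}^n) \to \Lambda^{n-k}_p(\mathbf{R}^n)$ defined in standard local coordinates by the equation,

$$\star(dx^{i_1} \wedge \dots \wedge dx^{i_k}) = \frac{\sqrt{\det g}}{(n-k)!}\, \epsilon^{i_1 \dots i_k}{}_{i_{k+1} \dots i_n}\, dx^{i_{k+1}} \wedge \dots \wedge dx^{i_n}. \tag{2.87}$$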




For flat Euclidean space $\sqrt{\det g} = 1$, so the factor in the definition may appear
superfluous. However, when we consider more general Riemannian manifolds,
we will have to be more careful with raising and lowering indices with the metric,
and take into account that the Levi-Civita symbol is not a tensor but something
slightly more complicated called a tensor density. Including the $\sqrt{\det g}$ is done
in anticipation of this more general setting later. Since the forms $dx^{i_1} \wedge \dots \wedge dx^{i_k}$
constitute a basis of the vector space $\Lambda^k_p(\mathbf{R}^n)$ and the $\star$ operator is assumed to
be a linear map, equation (2.87) completely specifies the map for all $k$-forms.
In particular, if the components of the dual of a form are equal to the components
of the form, the tensor is called self-dual. Of course, this can only happen if
the tensor and its dual are of the same rank.

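A metric $g$ on $\mathbf{R}^n$ induces an inner product on $\Lambda^k(\mathbf{R}^n)$ as follows. Let
$\{e_1, \dots, e_n\}$ be an orthonormal basis with dual basis $\theta^1, \dots, \theta^n$. If
$\alpha, \beta \in \Lambda^k(\mathbf{R}^n)$, we can write

$$\begin{aligned}
\alpha &= a_{i_1 \dots i_k}\, \theta^{i_1} \wedge \dots \wedge \theta^{i_k}, \\
\beta &= b_{j_1 \dots j_k}\, \theta^{j_1} \wedge \dots \wedge \theta^{j_k}.
\end{aligned}$$

The induced inner product is defined by

$$\langle \alpha, \beta \rangle^{(k)} = \frac{1}{k!}\, a_{i_1 \dots i_k}\, b^{i_1 \dots i_k}. \tag{2.88}$$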


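If $\alpha, \beta \in \Lambda^k(\mathbf{R}^n)$, then $\star\beta \in \Lambda^{n-k}(\mathbf{R}^n)$, so $\alpha \wedge \star\beta$ must be a multiple of the
volume form. The Hodge $\star$ operator is the unique isomorphism such that

$$\alpha \wedge \star\beta = \langle \alpha, \beta \rangle^{(k)}\, d\Omega. \tag{2.89}$$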
Clearly, by the symmetry of the inner product,

$$\alpha \wedge \star\beta = \beta \wedge \star\alpha.$$


When it is evident that the inner product is the induced inner product on
$\Lambda^k(\mathbf{R}^n)$ the indicator $(k)$ is often suppressed. An equivalent definition of the
induced inner product of two $k$-forms is given by

$$\langle \alpha, \beta \rangle = \int (\alpha \wedge \star\beta)\, d\Omega. \tag{2.90}$$


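If $\alpha$ is a $k$-form and $\beta$ is a $(k - 1)$-form, one can define the adjoint or
co-differential $\delta$ by

$$\langle \delta\alpha, \beta \rangle = \langle \alpha, d\beta \rangle. \tag{2.91}$$

The adjoint is given by

$$\delta = (-1)^{nk+n+1} \star d\, \star. \tag{2.92}$$

In particular,

$$\delta = \begin{cases} -\star d\, \star & \text{if } n \text{ is even,} \\ (-1)^k \star d\, \star & \text{if } n \text{ is odd.} \end{cases} \tag{2.93}$$

The differential maps $(k - 1)$-forms to $k$-forms, and the co-differential maps
$k$-forms to $(k - 1)$-forms. It is also the case that $\delta \circ \delta = 0$. The combination,

$$\Delta = (d + \delta)^2 = d\delta + \delta d \tag{2.94}$$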




extends the Laplacian operator to forms. It maps $k$-forms to $k$-forms. A central
result in harmonic analysis is the Hodge decomposition theorem, which states that
any $k$-form $\omega$ can be split uniquely as

$$\omega = d\alpha + \delta\beta + \gamma, \tag{2.95}$$

where $\alpha \in \Omega^{k-1}$, $\beta \in \Omega^{k+1}$ and $\Delta\gamma = 0$.

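2.4.3 Example Hodge operator in $\mathbf{R}^2$


In $\mathbf{R}^2$,

$$\star dx = dy, \qquad \star dy = -dx,$$

or, if one thinks of a matrix representation of $\star : \Omega^1(\mathbf{R}^2) \to \Omega^1(\mathbf{R}^2)$ in standard
basis, we can write the above as

$$\star \begin{bmatrix} dx \\ dy \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} dx \\ dy \end{bmatrix}.$$

The reader might wish to peek at the symplectic matrix 5.50 in the discussion in
chapter 5 on conformal mappings. Given functions $u = u(x, y)$ and $v = v(x, y)$,
let $\omega = u\, dx - v\, dy$. Then,

$$\begin{aligned}
d\omega &= -(u_y + v_x)\, dx \wedge dy, & d\omega = 0 &\;\Rightarrow\; u_y = -v_x, \\
d{\star}\omega &= (u_x - v_y)\, dx \wedge dy, & d{\star}\omega = 0 &\;\Rightarrow\; u_x = v_y.
\end{aligned} \tag{2.96}$$

Thus, the equations $d\omega = 0$ and $d{\star}\omega = 0$ are equivalent to the Cauchy-Riemann
equations for a holomorphic function $f(z) = u(x, y) + i v(x, y)$. On the other
hand,

$$\begin{aligned}
du &= u_x\, dx + u_y\, dy, \\
dv &= v_x\, dx + v_y\, dy,
\end{aligned}$$

so the determinant of the Jacobian of the transformation $T : (x, y) \to (u, v)$,
with the condition above on $\omega$, is given by,

$$|J| = \begin{vmatrix} u_x & u_y \\ v_x & v_y \end{vmatrix} = u_x^2 + u_y^2 = v_x^2 + v_y^2.$$

If $|J| \ne 0$, we can set $u_x = R\cos\phi$, $u_y = R\sin\phi$, for some $R$ and some angle
$\phi$. Then,

$$J = \begin{bmatrix} R & 0 \\ 0 & R \end{bmatrix} \begin{bmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{bmatrix}.$$

Thus, the transformation is given by the composition of a dilation and a
rotation. A more thorough discussion of this topic is found in the section on
conformal maps in chapter 5.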




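2.4.4 Example Hodge operator in $\mathbf{R}^3$

$$\begin{aligned}
\star dx^1 &= \tfrac{1}{2}\, \epsilon^1{}_{jk}\, dx^j \wedge dx^k, \\
&= \tfrac{1}{2}\, [\epsilon^1{}_{23}\, dx^2 \wedge dx^3 + \epsilon^1{}_{32}\, dx^3 \wedge dx^2], \\
&= \tfrac{1}{2}\, [dx^2 \wedge dx^3 - dx^3 \wedge dx^2], \\
&= \tfrac{1}{2}\, [dx^2 \wedge dx^3 + dx^2 \wedge dx^3], \\
&= dx^2 \wedge dx^3.
\end{aligned}$$

We leave it to the reader to complete the computation of the action of the $\star$
operator on the other basis forms. The results are

$$\begin{aligned}
\star dx^1 &= +dx^2 \wedge dx^3, \\
\star dx^2 &= -dx^1 \wedge dx^3, \\
\star dx^3 &= +dx^1 \wedge dx^2,
\end{aligned} \tag{2.97}$$

$$\begin{aligned}
\star(dx^2 \wedge dx^3) &= dx^1, \\
\star(-dx^1 \wedge dx^3) &= dx^2, \\
\star(dx^1 \wedge dx^2) &= dx^3,
\end{aligned} \tag{2.98}$$

and

$$\star(dx^1 \wedge dx^2 \wedge dx^3) = 1. \tag{2.99}$$

In particular, if $f : \mathbf{R}^3 \to \mathbf{R}$ is any 0-form (a function), then,

$$\begin{aligned}
\star f &= f\, (dx^1 \wedge dx^2 \wedge dx^3), \\
&= f\, dV,
\end{aligned} \tag{2.100}$$

where $dV$ is the volume form.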


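2.4.5 Example Let $\alpha = a_1\, dx^1 + a_2\, dx^2 + a_3\, dx^3$, and $\beta = b_1\, dx^1 + b_2\, dx^2 + b_3\, dx^3$.
Then,

$$\begin{aligned}
\star(\alpha \wedge \beta) &= (a_2 b_3 - a_3 b_2) \star(dx^2 \wedge dx^3) + (a_1 b_3 - a_3 b_1) \star(dx^1 \wedge dx^3) \\
&\quad + (a_1 b_2 - a_2 b_1) \star(dx^1 \wedge dx^2), \\
&= (a_2 b_3 - a_3 b_2)\, dx^1 + (a_3 b_1 - a_1 b_3)\, dx^2 + (a_1 b_2 - a_2 b_1)\, dx^3, \\
&= (\mathbf{a} \times \mathbf{b})_i\, dx^i.
\end{aligned} \tag{2.101}$$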
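The relation between the Hodge dual of a wedge product of 1-forms and the cross product can be tested with a few lines of code. The sketch below is an illustration that assumes flat Euclidean $\mathbf{R}^3$ (so $\sqrt{\det g} = 1$); it stores a 1-form as a tuple of three coefficients and a 2-form as a dictionary over the index pairs $(1,2)$, $(1,3)$, $(2,3)$. All function names are hypothetical.

```python
PAIRS = [(1, 2), (1, 3), (2, 3)]


def levi_civita(i, j, k):
    """Levi-Civita symbol for indices taken from {1, 2, 3}."""
    return (i - j) * (j - k) * (k - i) // 2


def star1(a):
    """Hodge dual of the 1-form a_i dx^i, returned as a 2-form
    {(j, k): coeff} over the basis dx^j ^ dx^k with j < k."""
    return {(j, k): sum(a[i - 1] * levi_civita(i, j, k) for i in (1, 2, 3))
            for (j, k) in PAIRS}


def star2(c):
    """Hodge dual of the 2-form c_{jk} dx^j ^ dx^k (j < k), as a 1-form."""
    return tuple(sum(c[(j, k)] * levi_civita(j, k, i) for (j, k) in PAIRS)
                 for i in (1, 2, 3))


def wedge11(a, b):
    """Wedge product of two 1-forms, as a 2-form over PAIRS."""
    return {(j, k): a[j - 1] * b[k - 1] - a[k - 1] * b[j - 1]
            for (j, k) in PAIRS}


def cross(a, b):
    """Ordinary cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


a, b = (1, 2, 3), (4, 5, 6)
print(star1((0, 1, 0)))        # the dual of dx^2 is -dx^1 ^ dx^3
print(star2(wedge11(a, b)))    # matches the cross product below
print(cross(a, b))
```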


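The previous examples provide some insight on the action of the $\wedge$ and $\star$
operators. If one thinks of the quantities $dx^1$, $dx^2$ and $dx^3$ as analogous to $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$,
then it should be apparent that equations (2.97) are the differential geometry
versions of the well-known relations

$$\begin{aligned}
\mathbf{i} &= \mathbf{j} \times \mathbf{k}, \\
\mathbf{j} &= -\mathbf{i} \times \mathbf{k}, \\
\mathbf{k} &= \mathbf{i} \times \mathbf{j}.
\end{aligned}$$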




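2.4.6 Example In Minkowski space the collection of all 2-forms has dimension
$\binom{4}{2} = 6$. The Hodge $\star$ operator in this case splits $\Omega^2(M_{1,3})$ into two 3-dim
subspaces $\Omega^2_\pm$, such that $\star : \Omega^2_\pm \to \Omega^2_\mp$.
More specifically, $\Omega^2_+$ is spanned by the forms $\{dx^0 \wedge dx^1, dx^0 \wedge dx^2, dx^0 \wedge dx^3\}$,
and $\Omega^2_-$ is spanned by the forms $\{dx^2 \wedge dx^3, -dx^1 \wedge dx^3, dx^1 \wedge dx^2\}$. The action
of $\star$ on $\Omega^2_+$ is

$$\begin{aligned}
\star(dx^0 \wedge dx^1) &= \tfrac{1}{2}\, \epsilon^{01}{}_{kl}\, dx^k \wedge dx^l = -dx^2 \wedge dx^3, \\
\star(dx^0 \wedge dx^2) &= \tfrac{1}{2}\, \epsilon^{02}{}_{kl}\, dx^k \wedge dx^l = +dx^1 \wedge dx^3, \\
\star(dx^0 \wedge dx^3) &= \tfrac{1}{2}\, \epsilon^{03}{}_{kl}\, dx^k \wedge dx^l = -dx^1 \wedge dx^2,
\end{aligned}$$

and on $\Omega^2_-$,

$$\begin{aligned}
\star(+dx^2 \wedge dx^3) &= \tfrac{1}{2}\, \epsilon^{23}{}_{kl}\, dx^k \wedge dx^l = dx^0 \wedge dx^1, \\
\star(-dx^1 \wedge dx^3) &= \tfrac{1}{2}\, \epsilon^{31}{}_{kl}\, dx^k \wedge dx^l = dx^0 \wedge dx^2, \\
\star(+dx^1 \wedge dx^2) &= \tfrac{1}{2}\, \epsilon^{12}{}_{kl}\, dx^k \wedge dx^l = dx^0 \wedge dx^3.
\end{aligned}$$

In verifying the equations above, we recall that the Levi-Civita symbols that
contain an index with value 0 in the up position have an extra minus sign as
a result of raising the index with $\eta^{00}$. If $F \in \Omega^2(M)$, we will formally write
$F = F_+ + F_-$, where $F_\pm \in \Omega^2_\pm$. We would like to note that the action of the
dual operator on $\Omega^2(M)$ is such that $\star : \Omega^2(M) \to \Omega^2(M)$, and $\star^2 = -1$. In a
vector space a map like $\star$, with the property $\star^2 = -1$ is called a linear involution
of the space. In the case in question, $\Omega^2_\pm$ are the eigenspaces corresponding to
the +1 and -1 eigenvalues of this involution. It is also worthwhile to calculate
the duals of 1-forms in $M_{1,3}$. The results are,

$$\begin{aligned}
\star dt &= -dx^1 \wedge dx^2 \wedge dx^3, \\
\star dx^1 &= +dx^2 \wedge dt \wedge dx^3, \\
\star dx^2 &= +dt \wedge dx^1 \wedge dx^3, \\
\star dx^3 &= +dx^1 \wedge dt \wedge dx^2.
\end{aligned} \tag{2.102}$$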


2.4.2 Laplacian 


Classical differential operators that enter in Green’s and Stokes’ theorems
are better understood as special manifestations of the exterior differential and
the Hodge $\star$ operators in $\mathbf{R}^3$. Here is precisely how this works:


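1. Let $f : \mathbf{R}^3 \to \mathbf{R}$ be a $C^\infty$ function. Then

$$df = \frac{\partial f}{\partial x^i}\, dx^i = \nabla f \cdot d\mathbf{x}. \tag{2.103}$$

2. Let $\alpha = A_i\, dx^i$ be a 1-form in $\mathbf{R}^3$. Then

$$\begin{aligned}
(\star d)\alpha &= \frac{1}{2}\left(\frac{\partial A_j}{\partial x^i} - \frac{\partial A_i}{\partial x^j}\right) \star(dx^i \wedge dx^j) \\
&= (\nabla \times \mathbf{A}) \cdot d\mathbf{x}.
\end{aligned} \tag{2.104}$$

3. Let $\alpha = B_1\, dx^2 \wedge dx^3 + B_2\, dx^3 \wedge dx^1 + B_3\, dx^1 \wedge dx^2$ be a 2-form in $\mathbf{R}^3$.
Then

$$\begin{aligned}
d\alpha &= \left(\frac{\partial B_1}{\partial x^1} + \frac{\partial B_2}{\partial x^2} + \frac{\partial B_3}{\partial x^3}\right) dx^1 \wedge dx^2 \wedge dx^3 \\
&= (\nabla \cdot \mathbf{B})\, dV.
\end{aligned} \tag{2.105}$$

4. Let $\alpha = B_i\, dx^i$, then

$$(\star d\, \star)\, \alpha = \nabla \cdot \mathbf{B}. \tag{2.106}$$

5. Let $f$ be a real valued function. Then the Laplacian is given by:

$$(\star d\, \star)\, df = \nabla \cdot \nabla f = \nabla^2 f. \tag{2.107}$$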

The Laplacian definition here is consistent with 2.94 because in the case of a
function $f$, that is, a 0-form, $\delta f = 0$ so $\Delta f = \delta d f$. The results above can be
summarized in terms of a short exact sequence called the de Rham complex as
shown in figure 2.4. The sequence is called exact because successive application
of the differential operator gives zero. That is, $d \circ d = 0$. Since there are no
4-forms in $\mathbf{R}^3$, the sequence terminates as shown.

$$\Omega^0(\mathbf{R}^3) \xrightarrow{\ d\ } \Omega^1(\mathbf{R}^3) \xrightarrow{\ d\ } \Omega^2(\mathbf{R}^3) \xrightarrow{\ d\ } \Omega^3(\mathbf{R}^3)$$

Fig. 2.4: de Rham Complex in $\mathbf{R}^3$

If one starts with a function in $\Omega^0(\mathbf{R}^3)$, then $(d \circ d) f = 0$ just says that
$\nabla \times \nabla f = 0$, as in the case of
conservative vector fields. If instead, one starts with a one form $\alpha$ in $\Omega^1(\mathbf{R}^3)$,
corresponding to a vector field $\mathbf{A}$, then $(d \circ d)\alpha = 0$ says that $\nabla \cdot (\nabla \times \mathbf{A}) = 0$,
as in the case of incompressible vector fields. If one starts with a function, but
instead of applying the differential twice consecutively, one “hops” in between
with the Hodge operator, the result is the Laplacian of the function.

If we denote by $R$ a simply connected closed region in Euclidean space
whose boundary is $\delta R$, then in terms of forms, the fundamental theorem of
calculus, Stokes’ theorem (see 2.81), and the divergence theorem in $\mathbf{R}^3$ can
be expressed by a single generalized Stokes’ theorem.

$$\int_{\delta R} \omega = \iint_R d\omega. \tag{2.108}$$


We find it irresistible to point out that if one defines a complex one-form,

$$\omega = f(z)\, dz, \tag{2.109}$$

where $f(z) = u(x, y) + i v(x, y)$, and where one assumes that $u, v$ are
differentiable with continuous derivatives, then the conditions introduced in equation
2.96 are equivalent to requiring that $d\omega = 0$. In other words, if the form is
closed, then $u$ and $v$ satisfy the Cauchy-Riemann equations. Stokes’ theorem
then tells us that in a contractible region with boundary $C$, the line integral

$$\int_C \omega = \int_C f(z)\, dz = 0.$$

This is Cauchy’s integral theorem. We should also point out the tantalizing
resemblance of equations 2.96 to Maxwell’s equations in the section that follows.


2.4.3 Maxwell Equations 


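The classical equations of Maxwell describing electromagnetic phenomena
are

$$\begin{aligned}
\nabla \cdot \mathbf{E} &= 4\pi\rho, & \nabla \times \mathbf{B} &= 4\pi \mathbf{J} + \frac{\partial \mathbf{E}}{\partial t}, \\
\nabla \cdot \mathbf{B} &= 0, & \nabla \times \mathbf{E} &= -\frac{\partial \mathbf{B}}{\partial t},
\end{aligned} \tag{2.110}$$

where we are using Gaussian units with $c = 1$. We would like to formulate
these equations in the language of differential forms. Let $x^\mu = (t, x^1, x^2, x^3)$ be
local coordinates in Minkowski’s space $M_{1,3}$. Define the Maxwell 2-form $F$ by
the equation

$$F = \frac{1}{2} F_{\mu\nu}\, dx^\mu \wedge dx^\nu, \qquad (\mu, \nu = 0, 1, 2, 3), \tag{2.111}$$

where

$$F_{\mu\nu} = \begin{bmatrix} 0 & -E_x & -E_y & -E_z \\ E_x & 0 & B_z & -B_y \\ E_y & -B_z & 0 & B_x \\ E_z & B_y & -B_x & 0 \end{bmatrix}. \tag{2.112}$$

Written in complete detail, Maxwell’s 2-form is given by

$$\begin{aligned}
F = {} & -E_x\, dt \wedge dx^1 - E_y\, dt \wedge dx^2 - E_z\, dt \wedge dx^3 \\
& + B_z\, dx^1 \wedge dx^2 - B_y\, dx^1 \wedge dx^3 + B_x\, dx^2 \wedge dx^3.
\end{aligned} \tag{2.113}$$

We also define the source current 1-form

$$J = J_\mu\, dx^\mu = \rho\, dt + J_1\, dx^1 + J_2\, dx^2 + J_3\, dx^3. \tag{2.114}$$


2.4.7 Proposition Maxwell’s Equations (2.110) are equivalent to the equations

$$\begin{aligned}
dF &= 0, \\
d{\star}F &= 4\pi \star J.
\end{aligned} \tag{2.115}$$


Proof The proof is by direct computation using the definitions of the exterior
derivative and the Hodge $\star$ operator.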




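$$\begin{aligned}
dF = {} & -\frac{\partial E_x}{\partial x^2}\, dx^2 \wedge dt \wedge dx^1 - \frac{\partial E_x}{\partial x^3}\, dx^3 \wedge dt \wedge dx^1 \\
& - \frac{\partial E_y}{\partial x^1}\, dx^1 \wedge dt \wedge dx^2 - \frac{\partial E_y}{\partial x^3}\, dx^3 \wedge dt \wedge dx^2 \\
& - \frac{\partial E_z}{\partial x^1}\, dx^1 \wedge dt \wedge dx^3 - \frac{\partial E_z}{\partial x^2}\, dx^2 \wedge dt \wedge dx^3 \\
& + \frac{\partial B_z}{\partial t}\, dt \wedge dx^1 \wedge dx^2 + \frac{\partial B_z}{\partial x^3}\, dx^3 \wedge dx^1 \wedge dx^2 \\
& - \frac{\partial B_y}{\partial t}\, dt \wedge dx^1 \wedge dx^3 - \frac{\partial B_y}{\partial x^2}\, dx^2 \wedge dx^1 \wedge dx^3 \\
& + \frac{\partial B_x}{\partial t}\, dt \wedge dx^2 \wedge dx^3 + \frac{\partial B_x}{\partial x^1}\, dx^1 \wedge dx^2 \wedge dx^3.
\end{aligned}$$

Collecting terms and using the antisymmetry of the wedge operator, we get

$$\begin{aligned}
dF = {} & \left(\frac{\partial B_x}{\partial x^1} + \frac{\partial B_y}{\partial x^2} + \frac{\partial B_z}{\partial x^3}\right) dx^1 \wedge dx^2 \wedge dx^3 \\
& + \left(\frac{\partial E_y}{\partial x^3} - \frac{\partial E_z}{\partial x^2} - \frac{\partial B_x}{\partial t}\right) dx^2 \wedge dt \wedge dx^3 \\
& + \left(\frac{\partial E_z}{\partial x^1} - \frac{\partial E_x}{\partial x^3} - \frac{\partial B_y}{\partial t}\right) dt \wedge dx^1 \wedge dx^3 \\
& + \left(\frac{\partial E_x}{\partial x^2} - \frac{\partial E_y}{\partial x^1} - \frac{\partial B_z}{\partial t}\right) dx^1 \wedge dt \wedge dx^2.
\end{aligned}$$

Therefore, $dF = 0$ iff

$$\frac{\partial B_x}{\partial x^1} + \frac{\partial B_y}{\partial x^2} + \frac{\partial B_z}{\partial x^3} = 0,$$

which is the same as

$$\nabla \cdot \mathbf{B} = 0,$$

and

$$\begin{aligned}
\frac{\partial E_y}{\partial x^3} - \frac{\partial E_z}{\partial x^2} - \frac{\partial B_x}{\partial t} &= 0, \\
\frac{\partial E_z}{\partial x^1} - \frac{\partial E_x}{\partial x^3} - \frac{\partial B_y}{\partial t} &= 0, \\
\frac{\partial E_x}{\partial x^2} - \frac{\partial E_y}{\partial x^1} - \frac{\partial B_z}{\partial t} &= 0,
\end{aligned}$$

which means that

$$-\nabla \times \mathbf{E} - \frac{\partial \mathbf{B}}{\partial t} = 0. \tag{2.116}$$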


To verify the second set of Maxwell equations, we first compute the dual of the
current density 1-form (2.114) using the results from example 2.4.6. We get

$$\star J = -\rho\, dx^1 \wedge dx^2 \wedge dx^3 + J_1\, dx^2 \wedge dt \wedge dx^3 + J_2\, dt \wedge dx^1 \wedge dx^3 + J_3\, dx^1 \wedge dt \wedge dx^2. \tag{2.117}$$




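We could now proceed to compute $d{\star}F$, but perhaps it is more elegant to
notice that $F \in \Omega^2(M)$, and so, according to example 2.4.6, $F$ splits into
$F = F_+ + F_-$. In fact, we see from (2.112) that the components of $F_+$ are those
of $-\mathbf{E}$ and the components of $F_-$ constitute the magnetic field vector $\mathbf{B}$. Using
the results of example 2.4.6, we can immediately write the components of $\star F$:

$$\begin{aligned}
\star F = {} & B_x\, dt \wedge dx^1 + B_y\, dt \wedge dx^2 + B_z\, dt \wedge dx^3 \\
& + E_z\, dx^1 \wedge dx^2 - E_y\, dx^1 \wedge dx^3 + E_x\, dx^2 \wedge dx^3,
\end{aligned} \tag{2.118}$$

or equivalently,

$$\star F_{\mu\nu} = \begin{bmatrix} 0 & B_x & B_y & B_z \\ -B_x & 0 & E_z & -E_y \\ -B_y & -E_z & 0 & E_x \\ -B_z & E_y & -E_x & 0 \end{bmatrix}. \tag{2.119}$$

Effectively, the dual operator amounts to exchanging

$$\begin{aligned}
\mathbf{E} &\longmapsto -\mathbf{B}, \\
\mathbf{B} &\longmapsto +\mathbf{E},
\end{aligned}$$

in the left hand side of the first set of Maxwell equations. We infer from
equations (2.116) and (2.117) that

$$\nabla \cdot \mathbf{E} = 4\pi\rho$$

and

$$\nabla \times \mathbf{B} - \frac{\partial \mathbf{E}}{\partial t} = 4\pi \mathbf{J}.$$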
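The exchange rule for the dual can be checked directly on the component matrices. The sketch below is a small illustration: the two layouts are transcribed from (2.112) and (2.119) as read here, which is an assumption about index placement, and the function names are invented for the example.

```python
def maxwell_matrix(E, B):
    """Components of the Maxwell 2-form, laid out as in (2.112)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [
        [0, -Ex, -Ey, -Ez],
        [Ex, 0, Bz, -By],
        [Ey, -Bz, 0, Bx],
        [Ez, By, -Bx, 0],
    ]


def dual_matrix(E, B):
    """Components of the dual 2-form, laid out as in (2.119)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [
        [0, Bx, By, Bz],
        [-Bx, 0, Ez, -Ey],
        [-By, -Ez, 0, Ex],
        [-Bz, Ey, -Ex, 0],
    ]


def negate(v):
    return tuple(-c for c in v)


E, B = (1, 2, 3), (4, 5, 6)   # arbitrary sample field values
F = maxwell_matrix(E, B)

# F is antisymmetric, as the component matrix of a 2-form must be.
print(all(F[i][j] == -F[j][i] for i in range(4) for j in range(4)))

# Substituting E -> -B and B -> +E in F reproduces the dual matrix.
print(maxwell_matrix(negate(B), E) == dual_matrix(E, B))
```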


Most standard electrodynamics textbooks carry out the computation entirely in
tensor components. To connect with this approach, we should mention that if
$F^{\mu\nu}$ represents the electromagnetic tensor, then the dual tensor is

$$\star F_{\mu\nu} = \frac{\sqrt{\det g}}{2}\, \epsilon_{\mu\nu\sigma\tau}\, F^{\sigma\tau}. \tag{2.120}$$


Since $dF = 0$, in a contractible region there exists a one form $A$ such that
$F = dA$. The form $A$ is called the 4-vector potential. The components of $A$ are,

$$\begin{aligned}
A &= A_\mu\, dx^\mu, \\
A_\mu &= (\phi, \mathbf{A}),
\end{aligned} \tag{2.121}$$

where $\phi$ is the electric potential and $\mathbf{A}$ the magnetic vector potential. The
components of the electromagnetic tensor $F$ are given by

$$F_{\mu\nu} = \frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu}. \tag{2.122}$$

The classical electromagnetic Lagrangian is

$$L_{EM} = -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} + J^\mu A_\mu, \tag{2.123}$$



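with corresponding Euler-Lagrange equations

$$\frac{\partial}{\partial x^\nu}\left[\frac{\partial L}{\partial (A_{\mu,\nu})}\right] - \frac{\partial L}{\partial A_\mu} = 0. \tag{2.124}$$

To carry out the computation we first use the Minkowski metric to write the Lagrangian
with the indices down. The key is to keep in mind that $A_{\mu,\nu}$ are treated as
independent variables, so the derivatives of $A_{\alpha,\beta}$ vanish unless $\mu = \alpha$ and
$\nu = \beta$. We get,

$$\begin{aligned}
\frac{\partial L}{\partial (A_{\mu,\nu})} &= -\frac{1}{4} \frac{\partial}{\partial (A_{\mu,\nu})} (F_{\alpha\beta} F^{\alpha\beta}), \\
&= -\frac{1}{4} \frac{\partial}{\partial (A_{\mu,\nu})} (F_{\alpha\beta} F_{\lambda\sigma}\, \eta^{\alpha\lambda} \eta^{\beta\sigma}), \\
&= -\frac{1}{4}\, \eta^{\alpha\lambda} \eta^{\beta\sigma} [F_{\alpha\beta} (\delta^\mu_\sigma \delta^\nu_\lambda - \delta^\mu_\lambda \delta^\nu_\sigma) + F_{\lambda\sigma} (\delta^\mu_\beta \delta^\nu_\alpha - \delta^\mu_\alpha \delta^\nu_\beta)], \\
&= -\frac{1}{4} [\eta^{\alpha\nu} \eta^{\beta\mu} F_{\alpha\beta} - \eta^{\alpha\mu} \eta^{\beta\nu} F_{\alpha\beta} + \eta^{\nu\lambda} \eta^{\mu\sigma} F_{\lambda\sigma} - \eta^{\mu\lambda} \eta^{\nu\sigma} F_{\lambda\sigma}], \\
&= -\frac{1}{4} [F^{\nu\mu} - F^{\mu\nu} + F^{\nu\mu} - F^{\mu\nu}], \\
&= F^{\mu\nu}.
\end{aligned}$$

On the other hand,

$$\frac{\partial L}{\partial A_\mu} = J^\mu.$$

Therefore, the field equations are

$$\frac{\partial}{\partial x^\nu} F^{\mu\nu} = J^\mu. \tag{2.125}$$

The dual equations equivalent to the other pair of Maxwell equations are

$$\frac{\partial}{\partial x^\nu} \star F^{\mu\nu} = 0.$$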


In the gauge theory formulation of classical electrodynamics, the invariant ex- 
pression for the Lagrangian is the square of the norm of the field F under the 
induced inner product 


< F,F >= - {a xF) dQ. (2.126) 


This is the starting point to generalize to non-Abelian gauge theories.


Chapter 3 


Connections 


3.1 Frames 


This chapter is dedicated to professor Arthur Fischer. In my second year as 
an undergraduate at Berkeley, I took the undergraduate course in differential 
geometry which to this day is still called Math 140. The driving force in my 
career was trying to understand the general theory of relativity, which was only 
available at the graduate level. However, the graduate course (Math 280 at the 
time) read that the only prerequisite was Math 140. So I got emboldened and 
enrolled in the graduate course taught that year by Dr. Fischer. The required 
book for the course was the classic by Adler, Bazin, Schiffer. I loved the book; 
it was definitely within my reach and I began to devour the pages with the great 
satisfaction that I was getting a grasp of the mathematics and the physics. On 
the other hand, I was completely lost in the course. It seemed as if it had 
nothing to do with the material I was learning on my own. Around the third 
week of classes, Dr. Fischer went through a computation with these mysterious 
operators, and upon finishing the computation he said if we were following, he 
had just derived the formula for the Christoffel symbols. Clearly, I was not 
following, they looked nothing like the Christoffel symbols I had learned from 
the book. So, with great embarrassment I went to his office and explained my 
predicament. He smiled, apologized when he did not need to, and invited me to 
1-1 sessions for the rest of the two-semester course. That is how I got through 
the book he was really using, namely Abraham-Marsden. I am forever grateful. 

As noted in Chapter 1, the theory of curves in $\mathbf{R}^3$ can be elegantly formulated
by introducing orthonormal triplets of vectors which we called Frenet
frames. The Frenet vectors are adapted to the curves in such a manner that the 
rate of change of the frame gives information about the curvature of the curve. 
In this chapter we will study the properties of arbitrary frames and their cor- 
responding rates of change in the direction of the various vectors in the frame. 
These concepts will then be applied later to special frames adapted to surfaces. 


3.1.1 Definition A coordinate frame in $\mathbf{R}^n$ is an $n$-tuple of vector fields
$\{e_1, \dots, e_n\}$ which are linearly independent at each point $p$ in the space.




In local coordinates $\{x^1, \dots, x^n\}$, we can always express the frame vectors
as linear combinations of the standard basis vectors

$$e_i = \sum_{j=1}^{n} \partial_j A^j{}_i = \partial_j A^j{}_i, \tag{3.1}$$

where $\partial_j = \frac{\partial}{\partial x^j}$. Placing the basis vectors $\partial_j$ on the left is done to be consistent
with the summation convention, keeping in mind that the differential operators
do not act on the matrix elements. We assume the matrix $A = (A^j{}_i)$ to be
nonsingular at each point. In linear algebra, this concept is called a change
of basis, the difference being that in our case, the transformation matrix $A$
depends on the position. A frame field is called orthonormal if at each point,

$$\langle e_i, e_j \rangle = \delta_{ij}. \tag{3.2}$$


Throughout this chapter, we will assume that all frame fields are orthonormal. 
Whereas this restriction is not necessary, it is convenient because it results in 
considerable simplification in computations.


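3.1.2 Proposition If $\{e_1, \dots, e_n\}$ is an orthonormal frame, then the transformation
matrix is orthogonal (i.e., $AA^T = I$).
Proof The proof is by direct computation. Let $e_i = \partial_k A^k{}_i$. Then

$$\begin{aligned}
\delta_{ij} &= \langle e_i, e_j \rangle,\\
&= \langle \partial_k A^k{}_i, \partial_l A^l{}_j \rangle,\\
&= A^k{}_i A^l{}_j \langle \partial_k, \partial_l \rangle,\\
&= A^k{}_i A^l{}_j \delta_{kl},\\
&= A^k{}_i A_{kj},\\
&= A^k{}_i (A^T)_{jk}.
\end{aligned}$$

Hence

$$\begin{aligned}
(A^T)_{jk} A^k{}_i &= \delta_{ij},\\
(A^T)^j{}_k A^k{}_i &= \delta^j{}_i,\\
A^T A &= I.
\end{aligned}$$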


Given a frame $\{e_i\}$, we can also introduce the corresponding dual coframe
forms $\theta^i$ by requiring that

$$\theta^i(e_j) = \delta^i{}_j. \tag{3.3}$$

Since the dual coframe is a set of 1-forms, they can also be expressed in local
coordinates as linear combinations

$$\theta^i = B^i{}_k\, dx^k.$$




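It follows from equation (3.3) that

$$\begin{aligned}
\theta^i(e_j) &= B^i{}_k\, dx^k(\partial_l A^l{}_j),\\
&= B^i{}_k A^l{}_j\, dx^k(\partial_l),\\
&= B^i{}_k A^l{}_j\, \delta^k{}_l,\\
\delta^i{}_j &= B^i{}_k A^k{}_j.
\end{aligned}$$

Therefore, we conclude that $BA = I$, so $B = A^{-1} = A^T$. In other words, when
the frames are orthonormal, we have

$$e_i = \partial_k A^k{}_i, \qquad \theta^i = A^i{}_k\, dx^k. \tag{3.4}$$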


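3.1.3 Example Consider the transformation from Cartesian to cylindrical
coordinates:
$$x = r\cos\theta, \quad y = r\sin\theta, \quad z = z. \tag{3.5}$$

Using the chain rule for partial derivatives, we have

$$\begin{aligned}
\frac{\partial}{\partial r} &= \cos\theta\frac{\partial}{\partial x} + \sin\theta\frac{\partial}{\partial y},\\
\frac{\partial}{\partial\theta} &= -r\sin\theta\frac{\partial}{\partial x} + r\cos\theta\frac{\partial}{\partial y},\\
\frac{\partial}{\partial z} &= \frac{\partial}{\partial z}.
\end{aligned}$$
The vectors $\frac{\partial}{\partial r}$ and $\frac{\partial}{\partial z}$ are clearly unit vectors.
To make the vector $\frac{\partial}{\partial\theta}$ a unit vector, it suffices to divide it by its length $r$.
We can then compute the dot products of each pair of vectors and easily verify
that the quantities

$$e_1 = \frac{\partial}{\partial r}, \quad e_2 = \frac{1}{r}\frac{\partial}{\partial\theta}, \quad e_3 = \frac{\partial}{\partial z}, \tag{3.6}$$
are a triplet of mutually orthogonal unit vectors and thus constitute an orthonormal
frame. The surfaces with constant value for the coordinates $r$, $\theta$ and
$z$ respectively, represent a set of mutually orthogonal surfaces at each point.
The frame vectors at a point are normal to these surfaces as shown in figure
3.1. Physicists often refer to these frame vectors as $\{\hat{r}, \hat{\theta}, \hat{z}\}$, or as $\{e_r, e_\theta, e_z\}$.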


3.1.4 Example For spherical coordinates (2.30)

$$\begin{aligned}
x &= r\sin\theta\cos\phi,\\
y &= r\sin\theta\sin\phi,\\
z &= r\cos\theta,
\end{aligned}$$




Fig. 3.1: Cylindrical and Spherical Frames. 


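the chain rule leads to

$$\begin{aligned}
\frac{\partial}{\partial r} &= \sin\theta\cos\phi\frac{\partial}{\partial x} + \sin\theta\sin\phi\frac{\partial}{\partial y} + \cos\theta\frac{\partial}{\partial z},\\
\frac{\partial}{\partial\theta} &= r\cos\theta\cos\phi\frac{\partial}{\partial x} + r\cos\theta\sin\phi\frac{\partial}{\partial y} - r\sin\theta\frac{\partial}{\partial z},\\
\frac{\partial}{\partial\phi} &= -r\sin\theta\sin\phi\frac{\partial}{\partial x} + r\sin\theta\cos\phi\frac{\partial}{\partial y}.
\end{aligned}$$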


The vector $\frac{\partial}{\partial r}$ is of unit length but the other two need to be normalized. As
before, all we need to do is divide the vectors by their magnitude. For $\frac{\partial}{\partial\theta}$ we
divide by $r$ and for $\frac{\partial}{\partial\phi}$ we divide by $r\sin\theta$. Taking the dot products of all pairs
and using basic trigonometric identities, one can verify that we again obtain an
orthonormal frame.

$$e_1 = e_r = \frac{\partial}{\partial r}, \quad e_2 = e_\theta = \frac{1}{r}\frac{\partial}{\partial\theta}, \quad e_3 = e_\phi = \frac{1}{r\sin\theta}\frac{\partial}{\partial\phi}. \tag{3.7}$$

Furthermore, the frame vectors are normal to triply orthogonal surfaces, which
in this case are spheres, cones and planes, as shown in figure 3.1. The fact that
the chain rule in the two situations above leads to orthonormal frames is not
coincidental. The results are related to the orthogonality of the level surfaces
$x^i$ = constant. Since the level surfaces are orthogonal whenever they intersect,
one expects the gradients of the surfaces to also be orthogonal. Transformations
of this type are called triply orthogonal systems.
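The orthonormality claim for the spherical frame can be spot-checked numerically. The following is a minimal sketch, not part of the original text: the function names `spherical_frame`, `dot` and `gram_matrix` are illustrative choices, and the Cartesian components are simply the chain-rule coefficients above after dividing by $1$, $r$ and $r\sin\theta$ respectively.

```python
import math

def spherical_frame(theta, phi):
    """Return (e_r, e_theta, e_phi) as Cartesian component triples."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    e_r = (st * cp, st * sp, ct)        # d/dr
    e_theta = (ct * cp, ct * sp, -st)   # (1/r) d/dtheta
    e_phi = (-sp, cp, 0.0)              # (1/(r sin theta)) d/dphi
    return e_r, e_theta, e_phi

def dot(u, v):
    """Euclidean inner product of two component triples."""
    return sum(a * b for a, b in zip(u, v))

def gram_matrix(theta, phi):
    """Matrix of inner products <e_i, e_j>; the identity for an orthonormal frame."""
    frame = spherical_frame(theta, phi)
    return [[dot(u, v) for v in frame] for u in frame]

G = gram_matrix(0.7, 1.9)
```

Up to rounding, `G` should be the identity matrix at any sample point, which is the statement $A^T A = I$ of Proposition 3.1.2 for this frame.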


3.2 Curvilinear Coordinates 


Orthogonal transformations, such as spherical and cylindrical coordinates, 
appear ubiquitously in mathematical physics, because the geometry of many
problems in this discipline exhibits symmetry with respect to an axis or to the
origin. In such situations, transformations to the appropriate coordinate sys- 
tem often result in considerable simplification of the field equations involved 
in the problem. It has been shown that the Laplace operator that appears 
in the potential, heat, wave, and Schrödinger field equations, is separable in 




exactly twelve orthogonal coordinate systems. A simple and efficient method 
to calculate the Laplacian in orthogonal coordinates can be implemented using 
differential forms. 


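3.2.1 Example In spherical coordinates the differential of arc length is given
by (see equation 2.31) the metric:

$$ds^2 = dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\, d\phi^2.$$

Let
$$\begin{aligned}
\theta^1 &= dr,\\
\theta^2 &= r\, d\theta,\\
\theta^3 &= r\sin\theta\, d\phi.
\end{aligned} \tag{3.8}$$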


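Note that these three 1-forms constitute the dual coframe to the orthonormal
frame derived in equation (3.7). Consider a scalar field $f = f(r, \theta, \phi)$. We
now calculate the Laplacian of $f$ in spherical coordinates using the methods of
section 2.4.2. To do this, we first compute the differential $df$ and express the
result in terms of the coframe.

$$\begin{aligned}
df &= \frac{\partial f}{\partial r}dr + \frac{\partial f}{\partial\theta}d\theta + \frac{\partial f}{\partial\phi}d\phi,\\
&= \frac{\partial f}{\partial r}\theta^1 + \frac{1}{r}\frac{\partial f}{\partial\theta}\theta^2 + \frac{1}{r\sin\theta}\frac{\partial f}{\partial\phi}\theta^3.
\end{aligned}$$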


The components df in the coframe represent the gradient in spherical coordi- 
nates. Continuing with the scheme of section 2.4.2, we next apply the Hodge 
x operator. Then, we rewrite the resulting 2-form in terms of wedge products 
of coordinate differentials so that we can apply the definition of the exterior 
derivative. 


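$$\begin{aligned}
\star df &= \frac{\partial f}{\partial r}\theta^2\wedge\theta^3 - \frac{1}{r}\frac{\partial f}{\partial\theta}\theta^1\wedge\theta^3 + \frac{1}{r\sin\theta}\frac{\partial f}{\partial\phi}\theta^1\wedge\theta^2,\\
&= r^2\sin\theta\frac{\partial f}{\partial r}d\theta\wedge d\phi - r\sin\theta\,\frac{1}{r}\frac{\partial f}{\partial\theta}dr\wedge d\phi + r\,\frac{1}{r\sin\theta}\frac{\partial f}{\partial\phi}dr\wedge d\theta,\\
&= r^2\sin\theta\frac{\partial f}{\partial r}d\theta\wedge d\phi - \sin\theta\frac{\partial f}{\partial\theta}dr\wedge d\phi + \frac{1}{\sin\theta}\frac{\partial f}{\partial\phi}dr\wedge d\theta,\\
d\star df &= \frac{\partial}{\partial r}\left(r^2\sin\theta\frac{\partial f}{\partial r}\right)dr\wedge d\theta\wedge d\phi - \frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial f}{\partial\theta}\right)d\theta\wedge dr\wedge d\phi + \frac{1}{\sin\theta}\frac{\partial}{\partial\phi}\left(\frac{\partial f}{\partial\phi}\right)d\phi\wedge dr\wedge d\theta,\\
&= \left[\sin\theta\frac{\partial}{\partial r}\left(r^2\frac{\partial f}{\partial r}\right) + \frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial f}{\partial\theta}\right) + \frac{1}{\sin\theta}\frac{\partial^2 f}{\partial\phi^2}\right]dr\wedge d\theta\wedge d\phi.
\end{aligned}$$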


Finally, rewriting the differentials back in terms of the coframe, we get 


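$$d\star df = \frac{1}{r^2\sin\theta}\left[\sin\theta\frac{\partial}{\partial r}\left(r^2\frac{\partial f}{\partial r}\right) + \frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial f}{\partial\theta}\right) + \frac{1}{\sin\theta}\frac{\partial^2 f}{\partial\phi^2}\right]\theta^1\wedge\theta^2\wedge\theta^3.$$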




Therefore, the Laplacian of f is given by 


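$$\nabla^2 f = \frac{1}{r^2}\frac{\partial}{\partial r}\left[r^2\frac{\partial f}{\partial r}\right] + \frac{1}{r^2}\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial f}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2 f}{\partial\phi^2}\right]. \tag{3.9}$$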


The derivation of the expression for the spherical Laplacian by differential forms 
is elegant and leads naturally to the operator in Sturm-Liouville form. 
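As a sanity check of the spherical Laplacian, one can compare it against a function whose Cartesian Laplacian is known in closed form. The sketch below is an illustration only: the test function $f = x^2y + z^3$ (with $\nabla^2 f = 2y + 6z$), the helper names, and the step size are all assumptions made for the example. Each Sturm-Liouville piece $\frac{d}{dt}\left(p\,\frac{dg}{dt}\right)$ is approximated by a second-order central difference.

```python
import math

def f_cart(x, y, z):
    """Sample scalar field; its Cartesian Laplacian is 2*y + 6*z."""
    return x * x * y + z ** 3

def f_sph(r, th, ph):
    """The same field expressed in spherical coordinates."""
    return f_cart(r * math.sin(th) * math.cos(ph),
                  r * math.sin(th) * math.sin(ph),
                  r * math.cos(th))

def sturm_liouville(g, p, t, h):
    """Central-difference estimate of d/dt( p(t) * dg/dt )."""
    right = p(t + h / 2) * (g(t + h) - g(t)) / h
    left = p(t - h / 2) * (g(t) - g(t - h)) / h
    return (right - left) / h

def laplacian_spherical(r, th, ph, h=1e-4):
    """Finite-difference evaluation of the spherical Laplacian of f_sph."""
    radial = sturm_liouville(lambda s: f_sph(s, th, ph), lambda s: s * s, r, h)
    polar = sturm_liouville(lambda s: f_sph(r, s, ph), math.sin, th, h)
    azimuthal = sturm_liouville(lambda s: f_sph(r, th, s), lambda s: 1.0, ph, h)
    return (radial + polar / math.sin(th) + azimuthal / math.sin(th) ** 2) / r ** 2
```

Evaluating `laplacian_spherical(r, th, ph)` at a point away from the polar axis should agree with $2y + 6z$ at the corresponding Cartesian point, up to discretization error.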

The process above can be carried out for general orthogonal transformations.
A change of coordinates $x^i = x^i(u^k)$ leads to an orthogonal transformation if
in the new coordinate system $u^k$, the line metric

$$ds^2 = g_{11}(du^1)^2 + g_{22}(du^2)^2 + g_{33}(du^3)^2 \tag{3.10}$$


only has diagonal entries. In this case, we choose the coframe 


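$$\begin{aligned}
\theta^1 &= \sqrt{g_{11}}\, du^1 = h_1\, du^1,\\
\theta^2 &= \sqrt{g_{22}}\, du^2 = h_2\, du^2,\\
\theta^3 &= \sqrt{g_{33}}\, du^3 = h_3\, du^3.
\end{aligned}$$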


Classically, the quantities {h1, h2, h3} are called the weights. Please note that, 
in the interest of connecting to classical terminology, we have exchanged two 
indices for one and this will cause small discrepancies with the index summation 
convention. We will revert to using a summation symbol when these discrepancies
occur. To satisfy the duality condition $\theta^i(e_j) = \delta^i{}_j$, we must choose the
corresponding frame vectors $e_i$ as follows:


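$$\begin{aligned}
e_1 &= \frac{1}{\sqrt{g_{11}}}\frac{\partial}{\partial u^1} = \frac{1}{h_1}\frac{\partial}{\partial u^1},\\
e_2 &= \frac{1}{\sqrt{g_{22}}}\frac{\partial}{\partial u^2} = \frac{1}{h_2}\frac{\partial}{\partial u^2},\\
e_3 &= \frac{1}{\sqrt{g_{33}}}\frac{\partial}{\partial u^3} = \frac{1}{h_3}\frac{\partial}{\partial u^3}.
\end{aligned}$$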


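Gradient. Let $f = f(x^i)$ and $x^i = x^i(u^k)$. Then

$$df = \frac{\partial f}{\partial x^k}dx^k = \frac{\partial f}{\partial u^i}\frac{\partial u^i}{\partial x^k}dx^k = \frac{\partial f}{\partial u^i}du^i = \sum_i \frac{1}{h_i}\frac{\partial f}{\partial u^i}\theta^i = e_i(f)\,\theta^i.$$

As expected, the components of the gradient in the coframe $\theta^i$ are just the
frame vectors,
$$\nabla = \left(\frac{1}{h_1}\frac{\partial}{\partial u^1},\ \frac{1}{h_2}\frac{\partial}{\partial u^2},\ \frac{1}{h_3}\frac{\partial}{\partial u^3}\right). \tag{3.11}$$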




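Curl. Let $F = (F_1, F_2, F_3)$ be a classical vector field. Construct the corresponding
1-form $F = F_i\theta^i$ in the coframe. We calculate the curl using the dual
of the exterior derivative. Writing $(hF)_i = h_iF_i$ (no sum), we have

$$\begin{aligned}
F &= F_1\theta^1 + F_2\theta^2 + F_3\theta^3,\\
&= (h_1F_1)du^1 + (h_2F_2)du^2 + (h_3F_3)du^3,\\
dF &= \frac{1}{2}\sum_{i,j}\left[\frac{\partial(hF)_j}{\partial u^i} - \frac{\partial(hF)_i}{\partial u^j}\right]du^i\wedge du^j,\\
&= \sum_{i,j}\frac{1}{2h_ih_j}\left[\frac{\partial(hF)_j}{\partial u^i} - \frac{\partial(hF)_i}{\partial u^j}\right]\theta^i\wedge\theta^j,\\
\star dF &= \sum_{i,j}\epsilon^{ij}{}_k\,\frac{1}{2h_ih_j}\left[\frac{\partial(hF)_j}{\partial u^i} - \frac{\partial(hF)_i}{\partial u^j}\right]\theta^k = (\nabla\times F)_k\,\theta^k.
\end{aligned}$$

Thus, the components of the curl are

$$\left(\frac{1}{h_2h_3}\left[\frac{\partial(h_3F_3)}{\partial u^2} - \frac{\partial(h_2F_2)}{\partial u^3}\right],\ \frac{1}{h_1h_3}\left[\frac{\partial(h_1F_1)}{\partial u^3} - \frac{\partial(h_3F_3)}{\partial u^1}\right],\ \frac{1}{h_1h_2}\left[\frac{\partial(h_2F_2)}{\partial u^1} - \frac{\partial(h_1F_1)}{\partial u^2}\right]\right).$$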


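Divergence. As before, let $F = F_i\theta^i$ and recall that $\nabla\cdot F = \star d\star F$. The
computation yields

$$\begin{aligned}
F &= F_1\theta^1 + F_2\theta^2 + F_3\theta^3,\\
\star F &= F_1\theta^2\wedge\theta^3 + F_2\theta^3\wedge\theta^1 + F_3\theta^1\wedge\theta^2,\\
&= (h_2h_3F_1)du^2\wedge du^3 + (h_1h_3F_2)du^3\wedge du^1 + (h_1h_2F_3)du^1\wedge du^2,\\
d\star F &= \left[\frac{\partial(h_2h_3F_1)}{\partial u^1} + \frac{\partial(h_1h_3F_2)}{\partial u^2} + \frac{\partial(h_1h_2F_3)}{\partial u^3}\right]du^1\wedge du^2\wedge du^3.
\end{aligned}$$
Therefore,
$$\nabla\cdot F = \star d\star F = \frac{1}{h_1h_2h_3}\left[\frac{\partial(h_2h_3F_1)}{\partial u^1} + \frac{\partial(h_1h_3F_2)}{\partial u^2} + \frac{\partial(h_1h_2F_3)}{\partial u^3}\right]. \tag{3.12}$$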
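The divergence formula lends itself to a direct numerical check, since it only requires the weights $h_i$ and the frame components $F_i$ as functions of the coordinates. The sketch below is illustrative and not from the original text: the function names, the central-difference step, and the choice of spherical weights $(1, r, r\sin\theta)$ with the sample field $F = r^2 e_r$ are assumptions made for the example.

```python
import math

def divergence(F, h, u, step=1e-5):
    """Evaluate the orthogonal-coordinate divergence at u = (u1, u2, u3).

    F(u) returns the frame components (F1, F2, F3);
    h(u) returns the weights (h1, h2, h3).
    """
    def flux(i, point):
        # h_j * h_k * F_i, with (i, j, k) a cyclic permutation of (0, 1, 2)
        hs, Fs = h(point), F(point)
        return hs[(i + 1) % 3] * hs[(i + 2) % 3] * Fs[i]

    total = 0.0
    for i in range(3):
        fwd = list(u)
        bwd = list(u)
        fwd[i] += step
        bwd[i] -= step
        total += (flux(i, fwd) - flux(i, bwd)) / (2 * step)
    h1, h2, h3 = h(u)
    return total / (h1 * h2 * h3)

def h_spherical(u):
    """Weights for spherical coordinates (r, theta, phi)."""
    r, theta, _ = u
    return (1.0, r, r * math.sin(theta))

def radial_field(u):
    """F = r^2 e_r, whose divergence is 4r."""
    r = u[0]
    return (r * r, 0.0, 0.0)

def polar_field(u):
    """F = e_theta, whose divergence is cot(theta) / r."""
    return (0.0, 1.0, 0.0)
```

For instance, `divergence(radial_field, h_spherical, (2.0, 1.0, 0.3))` should be close to $4r = 8$, in agreement with the familiar result $\nabla\cdot(r^2\hat{r}) = \frac{1}{r^2}\frac{\partial}{\partial r}(r^4)$.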


3.3 Covariant Derivative 


In this section we introduce a generalization of directional derivatives. The 
directional derivative measures the rate of change of a function in the direction 
of a vector. We seek a quantity which measures the rate of change of a vector 
field in the direction of another. 


3.3.1 Definition Given a pair $(X, Y)$ of arbitrary vector fields in $\mathbf{R}^n$, we
associate a new vector field $\nabla_XY$, so that $\nabla_X : \mathscr{X}(\mathbf{R}^n) \to \mathscr{X}(\mathbf{R}^n)$. The
quantity $\nabla$ is called a Koszul connection if it satisfies the following properties:

1. $\nabla_{fX}(Y) = f\nabla_XY$,

2. $\nabla_{(X_1+X_2)}Y = \nabla_{X_1}Y + \nabla_{X_2}Y$,

3. $\nabla_X(Y_1 + Y_2) = \nabla_XY_1 + \nabla_XY_2$,

4. $\nabla_X fY = X(f)Y + f\nabla_XY$,

for all vector fields $X, X_1, X_2, Y, Y_1, Y_2 \in \mathscr{X}(\mathbf{R}^n)$ and all smooth functions $f$.
Implicit in the properties, we set $\nabla_X f = X(f)$. The definition states that the
map $\nabla_X$ is linear on $X$ but behaves as a linear derivation on $Y$. For this reason,
the quantity $\nabla_XY$ is called the covariant derivative of $Y$ in the direction of $X$.


3.3.2 Proposition Let $Y = f^i\frac{\partial}{\partial x^i}$ be a vector field in $\mathbf{R}^n$, and let $X$ be another
$C^\infty$ vector field. Then the operator given by

$$\nabla_XY = X(f^i)\frac{\partial}{\partial x^i} \tag{3.13}$$

defines a Koszul connection.
Proof The proof just requires verification that the four properties above are 
satisfied, and it is left as an exercise. 

The operator defined in this proposition is the standard connection compat- 
ible with the Euclidean metric. The action of this connection on a vector field 
Y yields a new vector field whose components are the directional derivatives of 
the components of Y. 


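3.3.3 Example Let

$$X = x\frac{\partial}{\partial x} + xz\frac{\partial}{\partial y}, \qquad Y = x^2\frac{\partial}{\partial x} + xy^2\frac{\partial}{\partial y}.$$

Then,
$$\begin{aligned}
\nabla_XY &= X(x^2)\frac{\partial}{\partial x} + X(xy^2)\frac{\partial}{\partial y},\\
&= \left[x\frac{\partial}{\partial x}(x^2) + xz\frac{\partial}{\partial y}(x^2)\right]\frac{\partial}{\partial x} + \left[x\frac{\partial}{\partial x}(xy^2) + xz\frac{\partial}{\partial y}(xy^2)\right]\frac{\partial}{\partial y},\\
&= 2x^2\frac{\partial}{\partial x} + (xy^2 + 2x^2yz)\frac{\partial}{\partial y}.
\end{aligned}$$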
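The computation in this example can be reproduced numerically from the defining formula (3.13): each component of $\nabla_XY$ is the directional derivative of the corresponding component of $Y$ along $X$. The sketch below is illustrative only; it assumes the fields $X = x\,\partial_x + xz\,\partial_y$ and $Y = x^2\,\partial_x + xy^2\,\partial_y$ as read from the example, and the function name `covariant_derivative` and the finite-difference step are choices made here.

```python
def covariant_derivative(X, Y, p, h=1e-6):
    """Components of nabla_X Y at the point p for the standard connection in R^n.

    X(p) and Y(p) return the Cartesian components of the fields at p.
    """
    n = len(p)
    Xp = X(p)
    result = []
    for i in range(n):
        total = 0.0
        for k in range(n):
            fwd = list(p)
            bwd = list(p)
            fwd[k] += h
            bwd[k] -= h
            dYi_dxk = (Y(fwd)[i] - Y(bwd)[i]) / (2 * h)  # partial of Y^i along x^k
            total += Xp[k] * dYi_dxk                      # X(Y^i) = X^k d_k Y^i
        result.append(total)
    return result

def X_field(p):
    x, y, z = p
    return (x, x * z, 0.0)

def Y_field(p):
    x, y, z = p
    return (x * x, x * y * y, 0.0)
```

At a sample point $(x, y, z)$ the result should match $(2x^2,\ xy^2 + 2x^2yz,\ 0)$, the components obtained by hand above.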


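3.3.4 Definition A Koszul connection $\nabla_X$ is compatible with the metric
$g(Y, Z) = \langle Y, Z\rangle$ if

$$\nabla_X\langle Y, Z\rangle = \langle\nabla_XY, Z\rangle + \langle Y, \nabla_XZ\rangle. \tag{3.14}$$
If $F : \mathbf{R}^n \to \mathbf{R}^n$ is an isometry so that $\langle F_*X, F_*Y\rangle = \langle X, Y\rangle$, then it is
connection preserving in the sense

$$F_*(\nabla_XY) = \nabla_{F_*X}F_*Y. \tag{3.15}$$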


In Euclidean space, the components of the standard frame vectors are constant,
and thus their rates of change in any direction vanish. Let $e_i$ be an arbitrary frame
field with dual forms $\theta^i$. The covariant derivatives of the frame vectors in the
directions of a vector $X$ will in general yield new vectors. The new vectors must
be linear combinations of the basis vectors as follows:

$$\begin{aligned}
\nabla_Xe_1 &= \omega^1{}_1(X)e_1 + \omega^2{}_1(X)e_2 + \omega^3{}_1(X)e_3,\\
\nabla_Xe_2 &= \omega^1{}_2(X)e_1 + \omega^2{}_2(X)e_2 + \omega^3{}_2(X)e_3,\\
\nabla_Xe_3 &= \omega^1{}_3(X)e_1 + \omega^2{}_3(X)e_2 + \omega^3{}_3(X)e_3.
\end{aligned} \tag{3.16}$$


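The coefficients can be more succinctly expressed using the compact index notation,
$$\nabla_Xe_i = e_j\,\omega^j{}_i(X). \tag{3.17}$$

It follows immediately that
$$\omega^j{}_i(X) = \theta^j(\nabla_Xe_i). \tag{3.18}$$

Equivalently, one can take the inner product of both sides of equation (3.17)
with $e_k$ to get

$$\begin{aligned}
\langle\nabla_Xe_i, e_k\rangle &= \langle e_j\,\omega^j{}_i(X), e_k\rangle,\\
&= \omega^j{}_i(X)\langle e_j, e_k\rangle,\\
&= \omega^j{}_i(X)\,g_{jk}.
\end{aligned}$$
Hence,
$$\langle\nabla_Xe_i, e_k\rangle = \omega_{ki}(X). \tag{3.19}$$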


The left-hand side of the last equation is the inner product of two vectors, 
so the expression represents an array of functions. Consequently, the right- 
hand side also represents an array of functions. In addition, both expressions 
are linear on $X$, since by definition, $\nabla_X$ is linear on $X$. We conclude that the
right-hand side can be interpreted as a matrix in which each entry is a 1-form
acting on the vector $X$ to yield a function. The matrix valued quantity $\omega^i{}_j$ is
called the connection form. Sacrificing some consistency with the formalism of
differential forms for the sake of connecting to classical notation, we sometimes
write the above equation as

$$\langle de_i, e_k\rangle = \omega_{ki}, \tag{3.20}$$


where {e;} are vector calculus vectors forming an orthonormal basis. 


3.3.5 Definition Let $\nabla_X$ be a Koszul connection and let $\{e_i\}$ be a frame.
The Christoffel symbols associated with the connection in the given frame are
the functions $\Gamma^k{}_{ij}$ given by

$$\nabla_{e_i}e_j = \Gamma^k{}_{ij}\,e_k. \tag{3.21}$$

The Christoffel symbols are the coefficients that give the representation of the
rate of change of the frame vectors in the direction of the frame vectors themselves.
Many physicists therefore refer to the Christoffel symbols as the connection,
resulting in possible confusion. The precise relation between the Christoffel
symbols and the connection 1-forms is captured by the equations,

$$\omega^k{}_j(e_i) = \Gamma^k{}_{ij}, \tag{3.22}$$

or equivalently
$$\omega^k{}_j = \Gamma^k{}_{ij}\,\theta^i. \tag{3.23}$$


In a general frame in $\mathbf{R}^n$ there are $n^2$ entries in the connection 1-form and $n^3$
Christoffel symbols. The number of independent components is reduced if one
assumes that the frame is orthonormal.
If $T = T^ie_i$ is a general vector field, then
$$\begin{aligned}
\nabla_{e_j}T &= \nabla_{e_j}(T^ie_i),\\
&= T^i{}_{,j}\,e_i + T^i\,\Gamma^k{}_{ji}\,e_k,\\
&= (T^i{}_{,j} + T^k\,\Gamma^i{}_{jk})\,e_i,
\end{aligned} \tag{3.24}$$

which is denoted classically as the covariant derivative
$$T^i{}_{\|j} = T^i{}_{,j} + \Gamma^i{}_{jk}T^k. \tag{3.25}$$

Here, the comma in the subscript means regular derivative. The equation above
is also commonly written as

$$\nabla_{e_j}T^i = \nabla_jT^i = T^i{}_{,j} + \Gamma^i{}_{jk}T^k.$$


We should point out the accepted but inconsistent use of terminology. What is
meant by the notation $\nabla_jT^i$ above is not the covariant derivative of the vector
but the tensor components of the covariant derivative of the vector; one more 
reminder that most physicists conflate a tensor with its components. 


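3.3.6 Proposition Let $\{e_i\}$ be an orthonormal frame and $\nabla_X$ be a Koszul
connection compatible with the metric. Then

$$\omega_{ji} = -\omega_{ij}. \tag{3.26}$$
Proof Since it is given that $\langle e_i, e_j\rangle = \delta_{ij}$, we have

$$\begin{aligned}
0 &= \nabla_X\langle e_i, e_j\rangle,\\
&= \langle\nabla_Xe_i, e_j\rangle + \langle e_i, \nabla_Xe_j\rangle,\\
&= \langle\omega^k{}_i\,e_k, e_j\rangle + \langle e_i, \omega^k{}_j\,e_k\rangle,\\
&= \omega^k{}_i\langle e_k, e_j\rangle + \omega^k{}_j\langle e_i, e_k\rangle,\\
&= \omega^k{}_i\,g_{kj} + \omega^k{}_j\,g_{ik},\\
&= \omega_{ji} + \omega_{ij},
\end{aligned}$$

thus proving that $\omega$ is indeed antisymmetric.
The covariant derivative can be extended to the full tensor field $\mathscr{T}^r_s(\mathbf{R}^n)$ by
requiring that

a) $\nabla_X : \mathscr{T}^r_s(\mathbf{R}^n) \to \mathscr{T}^r_s(\mathbf{R}^n)$,

b) $\nabla_X(T_1\otimes T_2) = \nabla_XT_1\otimes T_2 + T_1\otimes\nabla_XT_2$,

c) $\nabla_X$ commutes with all contractions, $\nabla_X(CT) = C(\nabla_XT)$.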




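Let us compute the covariant derivative of a one-form $\omega$ with respect to a vector
field $X$. The contraction of $\omega\otimes Y$ is the function $i_Y\omega = \omega(Y)$. Taking the
covariant derivative, and using the Leibniz rule together with the fact that $\nabla_X$
commutes with contractions, we have,

$$\nabla_X(\omega(Y)) = X(\omega(Y)) = (\nabla_X\omega)(Y) + \omega(\nabla_XY).$$

Hence, the coordinate-free formula for the covariant derivative of a one-form is,

$$(\nabla_X\omega)(Y) = X(\omega(Y)) - \omega(\nabla_XY). \tag{3.27}$$
Let $\theta^i$ be the dual forms to $e_i$. We have
$$\nabla_X(\theta^i\otimes e_j) = \nabla_X\theta^i\otimes e_j + \theta^i\otimes\nabla_Xe_j.$$

The contraction is $i_{e_j}\theta^i = \theta^i(e_j) = \delta^i{}_j$. Hence, taking the contraction of the
equation above, we see that the left-hand side becomes 0, and we conclude
that,

$$(\nabla_X\theta^i)(e_j) = -\theta^i(\nabla_Xe_j). \tag{3.28}$$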


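Let $\omega = T_i\theta^i$. Then,
$$\begin{aligned}
(\nabla_X\omega)(e_j) &= (\nabla_X(T_i\theta^i))(e_j),\\
&= X(T_i)\,\theta^i(e_j) + T_i\,(\nabla_X\theta^i)(e_j),\\
&= X(T_j) - T_i\,\theta^i(\nabla_Xe_j).
\end{aligned} \tag{3.29}$$
If we now set $X = e_k$, we get,
$$\begin{aligned}
(\nabla_{e_k}\omega)(e_j) &= T_{j,k} - T_i\,\theta^i(\Gamma^l{}_{kj}\,e_l),\\
&= T_{j,k} - T_i\,\delta^i{}_l\,\Gamma^l{}_{kj},\\
&= T_{j,k} - \Gamma^i{}_{kj}\,T_i.
\end{aligned}$$
Classically, we write
$$\nabla_kT_j = T_{j\|k} = T_{j,k} - \Gamma^i{}_{kj}\,T_i. \tag{3.30}$$
In general, let $T$ be a tensor of type $\binom{r}{s}$,
$$T = T^{i_1\dots i_r}{}_{j_1\dots j_s}\,e_{i_1}\otimes\dots\otimes e_{i_r}\otimes\theta^{j_1}\otimes\dots\otimes\theta^{j_s}. \tag{3.31}$$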

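Since we know how to take the covariant derivative of a function, a vector,
and a one form, we can use the Leibniz rule for tensor products and the property of
the covariant derivative commuting with contractions, to get by induction, a
formula for the covariant derivative of an $\binom{r}{s}$-tensor,

$$\begin{aligned}
(\nabla_XT)(\theta^{i_1}, \dots, \theta^{i_r}, e_{j_1}, \dots, e_{j_s}) = {}& X(T(\theta^{i_1}, \dots, \theta^{i_r}, e_{j_1}, \dots, e_{j_s}))\\
& - T(\nabla_X\theta^{i_1}, \dots, \theta^{i_r}, e_{j_1}, \dots, e_{j_s}) - \dots - T(\theta^{i_1}, \dots, \nabla_X\theta^{i_r}, e_{j_1}, \dots, e_{j_s})\\
& - T(\theta^{i_1}, \dots, \theta^{i_r}, \nabla_Xe_{j_1}, \dots, e_{j_s}) - \dots - T(\theta^{i_1}, \dots, \theta^{i_r}, e_{j_1}, \dots, \nabla_Xe_{j_s}).
\end{aligned} \tag{3.32}$$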




The covariant derivative picks up a term with a positive Christoffel symbol
factor for each contravariant index and a term with a negative Christoffel symbol
factor for each covariant index. Thus, for example, for a $\binom{1}{2}$ tensor, the
components of the covariant derivative in classical notation are

$$\nabla_lT^i{}_{jk} = T^i{}_{jk\|l} = T^i{}_{jk,l} + \Gamma^i{}_{lh}T^h{}_{jk} - \Gamma^h{}_{lj}T^i{}_{hk} - \Gamma^h{}_{lk}T^i{}_{jh}. \tag{3.33}$$
In particular, if $g$ is the metric tensor and $X, Y, Z$ vector fields, we have
$$(\nabla_Xg)(Y, Z) = X(g(Y, Z)) - g(\nabla_XY, Z) - g(Y, \nabla_XZ).$$
Thus, if we impose the condition $\nabla_Xg = 0$, the equation above reads
$$\nabla_X\langle Y, Z\rangle = \langle\nabla_XY, Z\rangle + \langle Y, \nabla_XZ\rangle. \tag{3.34}$$


In other words, saying that a connection is compatible with the metric just means that the
metric is covariantly constant along any vector field.


In an orthonormal frame in $\mathbf{R}^n$ the number of independent coefficients of the
connection 1-form is $(1/2)n(n-1)$ since by antisymmetry, the diagonal entries
are zero, and one only needs to count the number of entries in the upper triangular
part of the $n\times n$ matrix $\omega_{ij}$. Similarly, the number of independent
Christoffel symbols gets reduced to $(1/2)n^2(n-1)$. Raising one index with
$g^{ij}$, we find that $\omega^i{}_j$ is also antisymmetric, so in $\mathbf{R}^3$ the connection equations
become

$$\nabla_X[e_1, e_2, e_3] = [e_1, e_2, e_3]\begin{bmatrix} 0 & \omega^1{}_2(X) & \omega^1{}_3(X) \\ -\omega^1{}_2(X) & 0 & \omega^2{}_3(X) \\ -\omega^1{}_3(X) & -\omega^2{}_3(X) & 0 \end{bmatrix}. \tag{3.35}$$


Comparing with the Frenet frame equation (1.39), we notice the obvious similarity
to the general frame equations above. Clearly, the Frenet frame is a special case
in which the basis vectors have been adapted to a curve, resulting in a simpler
connection in which some of the coefficients vanish. A further simplification
occurs in the Frenet frame, since in this case the equations represent the rate
of change of the frame only along the direction of the curve rather than an
arbitrary direction vector $X$. To elaborate on this transition from classical to
modern notation, consider a unit speed curve $\beta(s)$. Then, as we discussed in
section 1.15, we associate with the classical tangent vector $T = \frac{d\mathbf{x}}{ds}$ the vector
field $T = \beta'(s) = \frac{dx^i}{ds}\frac{\partial}{\partial x^i}$. Let $W = W(\beta(s)) = w^j(s)\frac{\partial}{\partial x^j}$ be an arbitrary vector
field constrained to the curve. The rate of change of $W$ along the curve is given
by
$$\begin{aligned}
\nabla_TW &= \nabla_{\left(\frac{dx^i}{ds}\frac{\partial}{\partial x^i}\right)}\left(w^j\frac{\partial}{\partial x^j}\right),\\
&= \frac{dx^i}{ds}\nabla_{\frac{\partial}{\partial x^i}}\left(w^j\frac{\partial}{\partial x^j}\right),\\
&= \frac{dx^i}{ds}\frac{\partial w^j}{\partial x^i}\frac{\partial}{\partial x^j},\\
&= \frac{dw^j}{ds}\frac{\partial}{\partial x^j},\\
&= W'(s).
\end{aligned}$$


3.4 Cartan Equations 


Perhaps the most important contribution to the development of modern
differential geometry is the work of Cartan, culminating in the famous equations
of structure discussed in this chapter.


First Structure Equation 


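3.4.1 Theorem Let $\{e_i\}$ be a frame with connection $\omega^i{}_j$ and dual coframe
$\theta^i$. Then

$$\Theta^i \equiv d\theta^i + \omega^i{}_j\wedge\theta^j = 0. \tag{3.36}$$

Proof Let
$$e_i = \partial_jA^j{}_i$$
be a frame, and let $\theta^i$ be the corresponding coframe. Since $\theta^i(e_j) = \delta^i{}_j$, we have
$$\theta^i = (A^{-1})^i{}_j\,dx^j.$$

Let $X$ be an arbitrary vector field. Then

$$\begin{aligned}
\nabla_Xe_i &= \nabla_X(\partial_jA^j{}_i),\\
e_j\,\omega^j{}_i(X) &= \partial_j\,X(A^j{}_i),\\
&= \partial_j\,d(A^j{}_i)(X),\\
&= e_k(A^{-1})^k{}_j\,d(A^j{}_i)(X),\\
\omega^k{}_i(X) &= (A^{-1})^k{}_j\,d(A^j{}_i)(X).
\end{aligned}$$

Hence,
$$\omega^k{}_i = (A^{-1})^k{}_j\,d(A^j{}_i),$$

or, in matrix notation,

$$\omega = A^{-1}dA. \tag{3.37}$$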




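On the other hand, taking the exterior derivative of $\theta^i$, we find that
$$\begin{aligned}
d\theta^i &= d(A^{-1})^i{}_j\wedge dx^j,\\
&= d(A^{-1})^i{}_j\wedge A^j{}_k\,\theta^k,\\
d\theta &= d(A^{-1})A\wedge\theta.
\end{aligned}$$
However, since $A^{-1}A = I$, we have $d(A^{-1})A = -A^{-1}dA = -\omega$, hence
$$d\theta = -\omega\wedge\theta. \tag{3.38}$$

In other words
$$d\theta^i + \omega^i{}_j\wedge\theta^j = 0.$$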


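3.4.2 Example SO(2,R)
Consider the polar coordinates part of the transformation in equation 3.5.
Then the frame equations 3.6 in matrix form are given by:

$$[e_1, e_2] = \left[\frac{\partial}{\partial x}, \frac{\partial}{\partial y}\right]\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}. \tag{3.39}$$

Thus, the attitude matrix

$$A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \tag{3.40}$$

is a rotation matrix in $\mathbf{R}^2$. The set of all such matrices forms a continuous
group (Lie group) called $SO(2, \mathbf{R})$. In such cases, the matrix

$$\omega = A^{-1}dA \tag{3.41}$$

in equation 3.37 is called the Maurer-Cartan form of the group. An easy computation
shows that for the rotation group $SO(2)$, the connection form is

$$\omega = \begin{bmatrix} 0 & -d\theta \\ d\theta & 0 \end{bmatrix}. \tag{3.42}$$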
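The "easy computation" for $SO(2)$ can be mirrored numerically by approximating $dA/d\theta$ with a central difference and multiplying by $A^{-1} = A^T$. This is a minimal sketch under stated assumptions: the helper names (`rotation`, `matmul`, `transpose`, `maurer_cartan_coefficient`) and the step size are choices made here, and the function returns the coefficient matrix $C$ in $\omega = C\,d\theta$ rather than the form itself.

```python
import math

def rotation(theta):
    """Attitude matrix A for a rotation by theta in the plane."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matmul(P, Q):
    """Plain nested-list matrix product."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    return [list(row) for row in zip(*P)]

def maurer_cartan_coefficient(theta, h=1e-6):
    """Matrix C with omega = C dtheta, computed as A^T (dA/dtheta)."""
    Ap, Am = rotation(theta + h), rotation(theta - h)
    dA = [[(Ap[i][j] - Am[i][j]) / (2 * h) for j in range(2)] for i in range(2)]
    return matmul(transpose(rotation(theta)), dA)
```

For any angle, the returned matrix should be close to $\begin{bmatrix} 0 & -1 \\ 1 & 0\end{bmatrix}$, independent of $\theta$, which is the content of the connection form for $SO(2)$.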


Second Structure Equation 


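Let $\theta^i$ be a coframe in $\mathbf{R}^n$ with connection $\omega^i{}_j$. Taking the exterior derivative
of the first equation of structure and recalling the properties (2.66), we get

$$\begin{aligned}
d(d\theta^i) + d(\omega^i{}_j\wedge\theta^j) &= 0,\\
d\omega^i{}_j\wedge\theta^j - \omega^i{}_j\wedge d\theta^j &= 0.
\end{aligned}$$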




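Substituting recursively from the first equation of structure, we get
$$\begin{aligned}
d\omega^i{}_j\wedge\theta^j - \omega^i{}_j\wedge(-\omega^j{}_k\wedge\theta^k) &= 0,\\
d\omega^i{}_k\wedge\theta^k + \omega^i{}_j\wedge\omega^j{}_k\wedge\theta^k &= 0,\\
(d\omega^i{}_k + \omega^i{}_j\wedge\omega^j{}_k)\wedge\theta^k &= 0.
\end{aligned}$$

3.4.3 Definition The curvature $\Omega$ of a connection $\omega$ is the matrix valued
2-form,
$$\Omega^i{}_j \equiv d\omega^i{}_j + \omega^i{}_k\wedge\omega^k{}_j. \tag{3.43}$$

3.4.4 Theorem Let $\theta$ be a coframe with connection $\omega$ in $\mathbf{R}^n$. Then the
curvature form vanishes:
$$\Omega = d\omega + \omega\wedge\omega = 0. \tag{3.44}$$

Proof Given that there is a non-singular matrix $A$ such that $\theta = A^{-1}dx$ and
$\omega = A^{-1}dA$, we have
$$d\omega = d(A^{-1})\wedge dA.$$

On the other hand,

$$\begin{aligned}
\omega\wedge\omega &= (A^{-1}dA)\wedge(A^{-1}dA),\\
&= -d(A^{-1})A\wedge A^{-1}dA,\\
&= -d(A^{-1})(AA^{-1})\wedge dA,\\
&= -d(A^{-1})\wedge dA.
\end{aligned}$$

Therefore, $d\omega = -\omega\wedge\omega$.

There is a slight abuse of the wedge notation here. The connection $\omega$ is matrix
valued, so the symbol $\omega\wedge\omega$ is really a composite of matrix and wedge
multiplication.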


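3.4.5 Example Sphere frame
The frame for spherical coordinates 3.7 in matrix form is

$$[e_r, e_\theta, e_\phi] = \left[\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right]\begin{bmatrix} \sin\theta\cos\phi & \cos\theta\cos\phi & -\sin\phi \\ \sin\theta\sin\phi & \cos\theta\sin\phi & \cos\phi \\ \cos\theta & -\sin\theta & 0 \end{bmatrix}.$$

Hence,
$$A^{-1} = \begin{bmatrix} \sin\theta\cos\phi & \sin\theta\sin\phi & \cos\theta \\ \cos\theta\cos\phi & \cos\theta\sin\phi & -\sin\theta \\ -\sin\phi & \cos\phi & 0 \end{bmatrix},$$

and

$$dA = \begin{bmatrix} \cos\theta\cos\phi\,d\theta - \sin\theta\sin\phi\,d\phi & -\sin\theta\cos\phi\,d\theta - \cos\theta\sin\phi\,d\phi & -\cos\phi\,d\phi \\ \cos\theta\sin\phi\,d\theta + \sin\theta\cos\phi\,d\phi & -\sin\theta\sin\phi\,d\theta + \cos\theta\cos\phi\,d\phi & -\sin\phi\,d\phi \\ -\sin\theta\,d\theta & -\cos\theta\,d\theta & 0 \end{bmatrix}.$$

Since $\omega = A^{-1}dA$ is antisymmetric, it suffices to compute:

$$\begin{aligned}
\omega^1{}_2 &= [-\sin^2\theta\cos^2\phi - \sin^2\theta\sin^2\phi - \cos^2\theta]\,d\theta\\
&\quad + [-\sin\theta\cos\theta\cos\phi\sin\phi + \sin\theta\cos\theta\cos\phi\sin\phi]\,d\phi,\\
&= -d\theta,\\
\omega^1{}_3 &= [-\sin\theta\cos^2\phi - \sin\theta\sin^2\phi]\,d\phi = -\sin\theta\,d\phi,\\
\omega^2{}_3 &= [-\cos\theta\cos^2\phi - \cos\theta\sin^2\phi]\,d\phi = -\cos\theta\,d\phi.
\end{aligned}$$

We conclude that the matrix-valued connection one form is

$$\omega = \begin{bmatrix} 0 & -d\theta & -\sin\theta\,d\phi \\ d\theta & 0 & -\cos\theta\,d\phi \\ \sin\theta\,d\phi & \cos\theta\,d\phi & 0 \end{bmatrix}.$$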


A slicker computation of the connection form can be obtained by a method of
educated guessing working directly from the structure equations. We have that
the dual one forms are:

$$\begin{aligned}
\theta^1 &= dr,\\
\theta^2 &= r\,d\theta,\\
\theta^3 &= r\sin\theta\,d\phi.
\end{aligned}$$
Then
$$\begin{aligned}
d\theta^2 &= -d\theta\wedge dr,\\
&= -\omega^2{}_1\wedge\theta^1 - \omega^2{}_3\wedge\theta^3.
\end{aligned}$$

So, on a first iteration we guess that $\omega^2{}_1 = d\theta$. The component $\omega^2{}_3$ is not necessarily
0 because it might contain terms with $d\phi$. Proceeding in this manner,
we compute:

$$\begin{aligned}
d\theta^3 &= \sin\theta\,dr\wedge d\phi + r\cos\theta\,d\theta\wedge d\phi,\\
&= -\sin\theta\,d\phi\wedge dr - \cos\theta\,d\phi\wedge r\,d\theta,\\
&= -\omega^3{}_1\wedge\theta^1 - \omega^3{}_2\wedge\theta^2.
\end{aligned}$$

Now we guess that $\omega^3{}_1 = \sin\theta\,d\phi$, and $\omega^3{}_2 = \cos\theta\,d\phi$. Finally, we insert these
into the full structure equations and check to see if any modifications need to be
made. In this case, the forms we have found are completely compatible with the
first equation of structure, so these must be the forms. The second equations
of structure are much more straightforward to verify. For example

$$\begin{aligned}
d\omega^2{}_3 &= d(-\cos\theta\,d\phi),\\
&= \sin\theta\,d\theta\wedge d\phi,\\
&= -d\theta\wedge(-\sin\theta\,d\phi),\\
&= -\omega^2{}_1\wedge\omega^1{}_3.
\end{aligned}$$
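The connection form of the sphere frame can also be checked numerically from $\omega = A^{-1}dA = A^TdA$, by differentiating the attitude matrix separately along $\theta$ and along $\phi$. The sketch below is only an illustration: the names `attitude` and `connection_coefficients`, and the finite-difference step, are assumptions made here. It returns the two coefficient matrices $W_\theta$ and $W_\phi$ in the decomposition $\omega = W_\theta\,d\theta + W_\phi\,d\phi$.

```python
import math

def attitude(theta, phi):
    """Attitude matrix A of the spherical frame (columns e_r, e_theta, e_phi)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return [[st * cp, ct * cp, -sp],
            [st * sp, ct * sp, cp],
            [ct, -st, 0.0]]

def connection_coefficients(theta, phi, h=1e-6):
    """Return (W_theta, W_phi) with omega = W_theta dtheta + W_phi dphi."""
    A = attitude(theta, phi)

    def coeff(Ap, Am):
        # Central difference for the partial derivative of A
        dA = [[(Ap[i][j] - Am[i][j]) / (2 * h) for j in range(3)]
              for i in range(3)]
        # (A^T dA)[i][j] = sum_k A[k][i] * dA[k][j]
        return [[sum(A[k][i] * dA[k][j] for k in range(3)) for j in range(3)]
                for i in range(3)]

    W_theta = coeff(attitude(theta + h, phi), attitude(theta - h, phi))
    W_phi = coeff(attitude(theta, phi + h), attitude(theta, phi - h))
    return W_theta, W_phi
```

Comparing with the matrix-valued one form found above, $W_\theta$ should have entries $-1$ and $+1$ in the $(1,2)$ and $(2,1)$ positions and zeros elsewhere, while $W_\phi$ should carry $-\sin\theta$ and $-\cos\theta$ in its last column, with the antisymmetric counterparts in the last row.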


Change of Basis 


We briefly explore the behavior of the quantities $\Theta^i$ and $\Omega^i{}_j$ under a change
of basis. Let $e_i$ be a frame in $M = \mathbf{R}^n$ with dual forms $\theta^i$, and let $\bar{e}_i$ be another
frame related to the first frame by an invertible transformation,

$$\bar{e}_i = e_jB^j{}_i, \tag{3.45}$$
which we will write in matrix notation as $\bar{e} = eB$. Referring back to the
definition of connections (3.17), we introduce the covariant differential $\nabla$ which
maps vectors into vector-valued forms,

$$\nabla : \Omega^0(M, TM) \to \Omega^1(M, TM)$$

given by the formula

$$\begin{aligned}
\nabla e_i &= e_j\otimes\omega^j{}_i,\\
&= e_j\,\omega^j{}_i,\\
\nabla e &= e\,\omega,
\end{aligned} \tag{3.46}$$

where, once again, we have simplified the equation by using matrix notation.
This definition is elegant because it does not explicitly show the dependence on
$X$ in the connection (3.17). The idea of switching from derivatives to differentials
is familiar from basic calculus. Consistent with equation 3.20, the vector
calculus notation for equation 3.46 would be

$$de_i = e_j\,\omega^j{}_i. \tag{3.47}$$


However, we point out that in the present context, the situation is much more
subtle. The operator $\nabla$ here maps a vector field to a matrix-valued tensor of
rank $\binom{1}{1}$. Another way to view the covariant differential is to think of $\nabla$ as an
operator such that if $e$ is a frame, and $X$ a vector field, then $\nabla e(X) = \nabla_Xe$. If $f$
is a function, then $\nabla f(X) = \nabla_Xf = df(X)$, so that $\nabla f = df$. In other words, $\nabla$
behaves like a covariant derivative on vectors, but like a differential on functions.
The action of the covariant differential also extends to the entire tensor algebra, 
but we do not need that formalism for now, and we delay discussion to section 
6.4 on connections on vector bundles. Taking the exterior differential of (3.45) 




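and using (3.46) recursively, we get

$$\begin{aligned}
\nabla\bar{e} &= (\nabla e)B + e(dB),\\
&= e\,\omega B + e(dB),\\
&= \bar{e}B^{-1}\omega B + \bar{e}B^{-1}dB,\\
&= \bar{e}\,[B^{-1}\omega B + B^{-1}dB],\\
&= \bar{e}\,\bar{\omega},
\end{aligned}$$

provided that the connection $\bar{\omega}$ in the new frame $\bar{e}$ is related to the connection
$\omega$ by the transformation law (see 6.62),

$$\bar{\omega} = B^{-1}\omega B + B^{-1}dB. \tag{3.48}$$

It should be noted that if $e$ is the standard frame $e_i = \partial_i$ in $\mathbf{R}^n$, then $\nabla e = 0$, so
that $\omega = 0$. In this case, the formula above reduces to $\bar{\omega} = B^{-1}dB$, showing that
the transformation rule is consistent with equation (3.37). The transformation
law for the curvature forms is,

$$\bar{\Omega} = B^{-1}\Omega B. \tag{3.49}$$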


A quantity transforming as in 3.49 is said to be a tensorial form of adjoint type. 


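3.4.6 Example Suppose that $B$ is a change of basis consisting of a rotation
by an angle $\theta$ about $e_3$. The transformation is an isometry that can be
represented by the orthogonal rotation matrix

$$B = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}. \tag{3.50}$$

Carrying out the computation for the change of basis 3.48, we find:

$$\begin{aligned}
\bar{\omega}^1{}_2 &= \omega^1{}_2 - d\theta,\\
\bar{\omega}^1{}_3 &= \cos\theta\,\omega^1{}_3 + \sin\theta\,\omega^2{}_3,\\
\bar{\omega}^2{}_3 &= -\sin\theta\,\omega^1{}_3 + \cos\theta\,\omega^2{}_3.
\end{aligned} \tag{3.51}$$

The $B^{-1}dB$ part of the transformation only affects the $\omega^1{}_2$ term, and the effect
is just to contribute a $-d\theta$ term, much like the case of the Maurer-Cartan form for $SO(2)$ above.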
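The transformation law (3.48) for this rotation can be verified with plain matrix arithmetic. The following sketch is illustrative and rests on stated assumptions: the connection is evaluated on a fixed vector $X$, so that $\omega(X)$ is an ordinary antisymmetric matrix `W` and $d\theta(X)$ is a number `dtheta`; the function names are hypothetical.

```python
import math

def matmul(P, Q):
    """Plain nested-list matrix product."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    return [list(row) for row in zip(*P)]

def transformed_connection(W, theta, dtheta):
    """Evaluate B^{-1} W B + B^{-1} dB(X) for a rotation B by theta about e_3.

    W is the matrix omega(X); dtheta is the number dtheta(X).
    """
    c, s = math.cos(theta), math.sin(theta)
    B = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    B_inv = transpose(B)  # B is orthogonal
    dB = [[-s * dtheta, -c * dtheta, 0.0],
          [c * dtheta, -s * dtheta, 0.0],
          [0.0, 0.0, 0.0]]
    conj = matmul(matmul(B_inv, W), B)
    gauge = matmul(B_inv, dB)
    return [[conj[i][j] + gauge[i][j] for j in range(3)] for i in range(3)]
```

With `W = [[0, a, b], [-a, 0, c], [-b, -c, 0]]`, the entries of the result in positions $(1,2)$, $(1,3)$ and $(2,3)$ should reproduce the three lines of (3.51): $a - d\theta(X)$, $\cos\theta\, b + \sin\theta\, c$ and $-\sin\theta\, b + \cos\theta\, c$.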


Chapter 4 


Theory of Surfaces 


4.1 Manifolds 


4.1.1 Definition A coordinate chart or coordinate patch in $M \subset \mathbf{R}^3$ is a
differentiable map $\mathbf{x}$ from an open subset $V$ of $\mathbf{R}^2$ onto a set $U \subset M$.

$$\mathbf{x} : V \subset \mathbf{R}^2 \to \mathbf{R}^3, \qquad (u, v) \overset{\mathbf{x}}{\longmapsto} (x(u, v), y(u, v), z(u, v)). \tag{4.1}$$

Each set $U = \mathbf{x}(V)$ is called a coordinate neighborhood of $M$. We require that
the Jacobian of the map has maximal rank. In local coordinates, a coordinate
chart is represented by three equations in two variables:

$$x^i = f^i(u^\alpha), \quad \text{where } i = 1, 2, 3, \ \alpha = 1, 2. \tag{4.2}$$

Fig. 4.1: Surface


It will be convenient to use the tensor index formalism when appropriate, so 
that we can continue to take advantage of the Einstein summation convention. 
The assumption that the Jacobian $J = (\partial x^i/\partial u^\alpha)$ be of maximal rank allows
one to invoke the implicit function theorem. Thus, in principle, one can locally
solve for one of the coordinates, say $x^3$, in terms of the other two, to get an
explicit function
$$x^3 = f(x^1, x^2). \tag{4.3}$$

The loci of points in $\mathbf{R}^3$ satisfying the equations $x^i = f^i(u^\alpha)$ can also be locally
represented implicitly by an expression of the form

$$F(x^1, x^2, x^3) = 0. \tag{4.4}$$


4.1.2 Definition Let $U_i$ and $U_j$ be two coordinate neighborhoods of a point
$p \in M$ with corresponding charts $\mathbf{x}(u^1, u^2) : V_i \to U_i \subset \mathbf{R}^3$ and $\mathbf{y}(v^1, v^2) :
V_j \to U_j \subset \mathbf{R}^3$ with a non-empty intersection $U_i \cap U_j \neq \emptyset$. On the overlaps, the
maps $\phi_{ij} = \mathbf{x}^{-1}\mathbf{y}$ are called transition functions or coordinate transformations.
(See figure 4.2)

Fig. 4.2: Coordinate Charts


4.1.3 Definition A differentiable manifold of dimension 2, is a space $M$
together with an indexed collection $\{U_\alpha\}_{\alpha\in I}$ of coordinate neighborhoods
satisfying the following properties:


1. The neighborhoods $\{U_\alpha\}$ constitute an open cover of $M$. That is, if $p \in M$,
then $p$ belongs to some chart.


2. For any pair of coordinate neighborhoods $U_i$ and $U_j$ with $U_i \cap U_j \neq \emptyset$,
the transition maps $\phi_{ij}$ and their inverses are differentiable.


3. An indexed collection satisfying the conditions above is called an atlas. 
We require the atlas to be maximal in the sense that it contains all possible 
coordinate neighborhoods. 


The overlapping coordinate patches represent different parametrizations for the
same set of points in $\mathbf{R}^3$. Part (2) of the definition ensures that on the overlap,
the coordinate transformations are invertible. Part (3) is included for technical
reasons, although in practice the condition is superfluous. A family of coordinate
neighborhoods satisfying conditions (1) and (2) can always be extended to
a maximal atlas. This can be shown from the fact that $M$ inherits a subspace
topology consisting of open sets which are defined by the intersection of open
sets in $\mathbf{R}^3$ with $M$.

If the coordinate patches in the definition map from $\mathbf{R}^n$ to $\mathbf{R}^m$, with $n$ less than $m$, we
say that $M$ is an $n$-dimensional submanifold embedded in $\mathbf{R}^m$. In fact, one could
define an abstract manifold without the reference to the embedding space by
starting with a topological space $M$ that is locally Euclidean via homeomorphic
coordinate patches and has a differentiable structure as in the definition above.
However, it turns out that any differentiable manifold of dimension $n$ can be
embedded in $\mathbf{R}^{2n}$, as proved by Whitney in a theorem that is beyond the scope
of these notes. 

A 2-dimensional manifold embedded in $\mathbf{R}^3$ in which the transition functions
are $C^\infty$ is called a smooth surface. The first condition in the definition
states that each coordinate neighborhood looks locally like a subset of $\mathbf{R}^2$. The
second differentiability condition indicates that the patches are joined together
smoothly as some sort of quilt. We summarize this notion by saying that a
manifold is a space that is locally Euclidean and has a differentiable structure,
so that the notion of differentiation makes sense. Of course, $\mathbf{R}^n$ is itself an
$n$-dimensional manifold.

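The smoothness condition on the coordinate component functions $x^i(u^\alpha)$
implies that at any point $x^i(u_0^\alpha + h^\alpha)$ near a point $x^i(u_0^\alpha) = x^i(u_0, v_0)$, the
functions admit a Taylor expansion

$$x^i(u_0^\alpha + h^\alpha) = x^i(u_0^\alpha) + h^\alpha \left(\frac{\partial x^i}{\partial u^\alpha}\right)_0 + \frac{1}{2!}\, h^\alpha h^\beta \left(\frac{\partial^2 x^i}{\partial u^\alpha \partial u^\beta}\right)_0 + \dots \tag{4.5}$$

Since the parameters $u^\alpha$ must enter independently, the Jacobian matrix

$$J \equiv \left[\frac{\partial x^i}{\partial u^\alpha}\right]$$

must have maximal rank. At points where $J$ has rank 0 or 1, there is a singularity
in the coordinate patch.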


4.1.4 Example Consider the local coordinate chart for the unit sphere obtained
by setting $r = 1$ in the equations for spherical coordinates 2.30

$$\mathbf{x}(\theta, \phi) = (\sin\theta\cos\phi, \sin\theta\sin\phi, \cos\theta).$$

The vector equation is equivalent to three scalar functions in two variables:

$$\begin{aligned}
x &= \sin\theta\cos\phi, \\
y &= \sin\theta\sin\phi, \\
z &= \cos\theta.
\end{aligned} \tag{4.6}$$


Clearly, the surface represented by this chart is part of the sphere $x^2 + y^2 + z^2 = 1$.
The chart cannot possibly represent the whole sphere because, although
a sphere is locally Euclidean (the earth is locally flat), there is certainly a
topological difference between a sphere and a plane. Indeed, if one analyzes the
coordinate chart carefully, one will note that at the North pole ($\theta = 0$, $z = 1$),
the coordinates become singular. This happens because $\theta = 0$ implies that
$x = y = 0$ regardless of the value of $\phi$, so that the North pole has an infinite
number of labels. In this coordinate patch, the Jacobian at the North pole does
not have maximal rank. To cover the entire sphere, one would need at least
two coordinate patches. In fact, introducing an exactly analogous patch $\mathbf{y}(u, v)$
based on the South pole would suffice, as long as in the overlap around the equator
the functions $\mathbf{x}^{-1}\mathbf{y}$ and $\mathbf{y}^{-1}\mathbf{x}$ are smooth. One could conceive more elaborate
coordinate patches such as those used in baseball and soccer balls.
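
The loss of rank at the North pole can be checked numerically. The following is a minimal Python sketch, not part of the original text; the function names and the tolerance `tol` are illustrative assumptions. It relies on the fact that a $2 \times 3$ Jacobian has rank 2 exactly when the cross product of its two rows is nonzero.

```python
import math

def jacobian_rows(theta, phi):
    """Rows of J for the unit-sphere chart: the partials x_theta and x_phi."""
    x_theta = (math.cos(theta) * math.cos(phi),
               math.cos(theta) * math.sin(phi),
               -math.sin(theta))
    x_phi = (-math.sin(theta) * math.sin(phi),
             math.sin(theta) * math.cos(phi),
             0.0)
    return x_theta, x_phi

def cross(a, b):
    """Cross product of two vectors in R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def has_maximal_rank(theta, phi, tol=1e-12):
    """True when x_theta x x_phi is nonzero, i.e. when J has rank 2."""
    w = cross(*jacobian_rows(theta, phi))
    return math.sqrt(sum(c * c for c in w)) > tol

regular = has_maximal_rank(math.pi / 3, 0.7)   # a generic point of the chart
at_pole = has_maximal_rank(0.0, 0.7)           # theta = 0 is the North pole
```

For this chart the length of the cross product is $\sin\theta$, so the rank drops only where $\sin\theta = 0$.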

The fact that it is required to have two parameters to describe a patch on
a surface in $\mathbf{R}^3$ is a manifestation of the 2-dimensional nature of the surfaces.
If one holds one of the parameters constant while varying the other, then the
resulting 1-parameter equation describes a curve on the surface. Thus, for example,
letting $\phi$ = constant in equation (4.6), we get the equation of a meridian
great circle.


4.1.5 Example Surface of revolution
Given a function $f(r)$, the coordinate chart

$$\mathbf{x}(r, \phi) = (r\cos\phi, r\sin\phi, f(r)) \tag{4.7}$$

represents a surface of revolution around the z-axis in which the cross section
profile has the shape of the function. Horizontal cross-sections are circles of
radius $r$. In figure 4.3, we have chosen the function $f(r) = e^{-r^2}$ to be a Gaussian, so
the surface of revolution is bell-shaped. A lateral curve profile for $\phi = \pi/4$ is
shown in black. We should point out that this parametrization of surfaces of
revolution is fairly constraining because of the requirement of $z = f(r)$ to be a
function. Thus, for instance, the parametrization will not work for surfaces of
revolution generated by closed curves. In the next example, we illustrate how
one can easily get around this constraint.

Fig. 4.3: Bell


4.1.6 Example Torus 

Consider the surface of revolution generated by rotating a circle $C$ of radius $r$
around a parallel axis located a distance $R$ from its center as shown in figure
4.4.

The resulting surface, called a torus, can be parametrized by the coordinate patch

$$\mathbf{x}(u, v) = ((R + r\cos u)\cos v, (R + r\cos u)\sin v, r\sin u). \tag{4.8}$$

Here the angle $v$ traces points around the z-axis, whereas the angle $u$ traces
points around the circle $C$. (At the risk of some confusion in notation, the
parameters in the figure are bold-faced; this is done solely for the purpose
of visibility.) The projection of a point in the surface of the torus onto the
xy-plane is located at a distance $(R + r\cos u)$ from the origin. Thus, the $x$
and $y$ coordinates of the point in the torus are just the polar coordinates of
the projection of the point in the plane. The z-coordinate corresponds to the
height of a right triangle with hypotenuse $r$ and opposite angle $u$.

Fig. 4.4: Torus


4.1.7 Example Monge patch 

Surfaces in $\mathbf{R}^3$ are first introduced in vector calculus by a function of two
variables $z = f(x, y)$. We will find it useful for consistency to use the obvious
parametrization called a Monge patch

$$\mathbf{x}(u, v) = (u, v, f(u, v)). \tag{4.9}$$


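4.1.8 Notation Given a parametrization of a surface in a local chart $\mathbf{x}(u, v) = \mathbf{x}(u^1, u^2) = \mathbf{x}(u^\alpha)$, we will denote the partial derivatives by any of the following
notations:

$$\begin{aligned}
\mathbf{x}_u &= \mathbf{x}_1 = \frac{\partial \mathbf{x}}{\partial u}, & \mathbf{x}_{uu} &= \mathbf{x}_{11} = \frac{\partial^2 \mathbf{x}}{\partial u^2}, \\
\mathbf{x}_v &= \mathbf{x}_2 = \frac{\partial \mathbf{x}}{\partial v}, & \mathbf{x}_{vv} &= \mathbf{x}_{22} = \frac{\partial^2 \mathbf{x}}{\partial v^2},
\end{aligned}$$

or more succinctly,

$$\mathbf{x}_\alpha = \frac{\partial \mathbf{x}}{\partial u^\alpha}, \qquad \mathbf{x}_{\alpha\beta} = \frac{\partial^2 \mathbf{x}}{\partial u^\alpha \partial u^\beta}. \tag{4.10}$$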


4.2 The First Fundamental Form 


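Let $x^i(u^\alpha)$ be a local parametrization of a surface. Then, the Euclidean
inner product in $\mathbf{R}^3$ induces an inner product in the space of tangent vectors
at each point in the surface. This metric on the surface is obtained as follows:

$$\begin{aligned}
dx^i &= \frac{\partial x^i}{\partial u^\alpha}\, du^\alpha, \\
ds^2 &= \delta_{ij}\, dx^i dx^j \\
&= \delta_{ij}\, \frac{\partial x^i}{\partial u^\alpha} \frac{\partial x^j}{\partial u^\beta}\, du^\alpha du^\beta.
\end{aligned}$$

Thus,

$$ds^2 = g_{\alpha\beta}\, du^\alpha du^\beta, \tag{4.11}$$

where

$$g_{\alpha\beta} = \delta_{ij}\, \frac{\partial x^i}{\partial u^\alpha} \frac{\partial x^j}{\partial u^\beta}. \tag{4.12}$$
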
We conclude that the surface, by virtue of being embedded in $\mathbf{R}^3$, inherits
a natural metric (4.11) which we will call the induced metric. A pair $\{M, g\}$,
where $M$ is a manifold and $g = g_{\alpha\beta}\, du^\alpha \otimes du^\beta$ is a metric, is called a Riemannian
manifold if considered as an entity in itself, and a Riemannian submanifold
of $\mathbf{R}^n$ if viewed as an object embedded in Euclidean space. An equivalent
version of the metric (4.11) can be obtained by using a more traditional calculus
notation:


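$$\begin{aligned}
d\mathbf{x} &= \mathbf{x}_u\, du + \mathbf{x}_v\, dv, \\
ds^2 &= d\mathbf{x} \cdot d\mathbf{x} \\
&= (\mathbf{x}_u\, du + \mathbf{x}_v\, dv) \cdot (\mathbf{x}_u\, du + \mathbf{x}_v\, dv) \\
&= (\mathbf{x}_u \cdot \mathbf{x}_u)\, du^2 + 2(\mathbf{x}_u \cdot \mathbf{x}_v)\, du\, dv + (\mathbf{x}_v \cdot \mathbf{x}_v)\, dv^2.
\end{aligned}$$

We can rewrite the last result as

$$ds^2 = E\, du^2 + 2F\, du\, dv + G\, dv^2, \tag{4.13}$$

where

$$\begin{aligned}
E &= g_{11} = \mathbf{x}_u \cdot \mathbf{x}_u, \\
F &= g_{12} = \mathbf{x}_u \cdot \mathbf{x}_v \\
&= g_{21} = \mathbf{x}_v \cdot \mathbf{x}_u, \\
G &= g_{22} = \mathbf{x}_v \cdot \mathbf{x}_v.
\end{aligned}$$

That is,

$$g_{\alpha\beta} = \mathbf{x}_\alpha \cdot \mathbf{x}_\beta = \langle \mathbf{x}_\alpha, \mathbf{x}_\beta \rangle.$$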


4.2.1 Definition First fundamental form
The element of arc length,

$$ds^2 = g_{\alpha\beta}\, du^\alpha \otimes du^\beta, \tag{4.14}$$

is also called the first fundamental form.


We must caution the reader that this quantity is not a form in the sense
of differential geometry since $ds^2$ involves the symmetric tensor product rather
than the wedge product. The first fundamental form plays such a crucial role in
the theory of surfaces that we will find it convenient to introduce a more modern
version. Following the same development as in the theory of curves, consider
a surface $M$ defined locally by a function $q = (u^1, u^2) \mapsto p = \alpha(u^1, u^2)$. We
say that a quantity $X_p$ is a tangent vector at a point $p \in M$ if $X_p$ is a linear
derivation on the space of $C^\infty$ real-valued functions $\mathscr{F} = \{f \mid f : M \to \mathbf{R}\}$ on
the surface. The set of all tangent vectors at a point $p \in M$ is called the tangent
space $T_pM$. As before, a vector field $X$ on the surface is a smooth choice of a
tangent vector at each point on the surface and the union of all tangent spaces
is called the tangent bundle $TM$. Sections of the tangent bundle of $M$ are
consistently denoted by $\mathscr{X}(M)$. The coordinate chart map $\alpha : \mathbf{R}^2 \to M \subset \mathbf{R}^3$
induces a push-forward map $\alpha_* : T\mathbf{R}^2 \to TM$ which maps a vector $V$ at
each point in $T_q(\mathbf{R}^2)$ into a vector $V_{\alpha(q)} = \alpha_*(V_q)$ in $T_{\alpha(q)}M$, as illustrated in
the diagram 4.5.

[Diagram: $\alpha_* : T\mathbf{R}^2 \to TM$ lying over $\alpha : \mathbf{R}^2 \to M \subset \mathbf{R}^3$, followed by $f : M \to \mathbf{R}$, with $V \in T\mathbf{R}^2$ and $q \in \mathbf{R}^2$.]

Fig. 4.5: Push-Forward


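The action of the push-forward is defined by

$$(\alpha_*(V)(f))\big|_{\alpha(q)} = V(f \circ \alpha)\big|_q. \tag{4.15}$$

Just as in the case of curves, when we revert back to classical notation to
describe a surface as $x^i(u^\alpha)$, what we really mean is $(x^i \circ \alpha)(u^\alpha)$, where $x^i$ are
the coordinate functions in $\mathbf{R}^3$. Particular examples of tangent vectors on $M$
are given by the push-forward of the standard basis of $T\mathbf{R}^2$. These tangent
vectors, which earlier we called $\mathbf{x}_\alpha$, are defined by

$$\alpha_*\left(\frac{\partial}{\partial u^\alpha}\right)(f)\Big|_{\alpha(u^\alpha)} = \frac{\partial}{\partial u^\alpha}(f \circ \alpha)\Big|_{u^\alpha}.$$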


In this formalism, the first fundamental form $I$ is just the symmetric bilinear
tensor defined by the induced metric,

$$I(X, Y) = g(X, Y) = \langle X, Y \rangle, \tag{4.16}$$

where $X$ and $Y$ are any pair of vector fields in $\mathscr{X}(M)$.


Orthogonal Parametric Curves 


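Let $V$ and $W$ be vectors tangent to a surface $M$ defined locally by a chart
$\mathbf{x}(u^\alpha)$. Since the vectors $\mathbf{x}_\alpha$ span the tangent space of $M$ at each point, the
vectors $V$ and $W$ can be written as linear combinations,

$$\begin{aligned}
V &= V^\alpha \mathbf{x}_\alpha, \\
W &= W^\alpha \mathbf{x}_\alpha.
\end{aligned}$$

The functions $V^\alpha$ and $W^\alpha$ are the curvilinear components of the vectors. We
can calculate the length and the inner product of the vectors using the induced
Riemannian metric as follows:

$$\begin{aligned}
\|V\|^2 &= \langle V, V \rangle = \langle V^\alpha \mathbf{x}_\alpha, V^\beta \mathbf{x}_\beta \rangle = V^\alpha V^\beta \langle \mathbf{x}_\alpha, \mathbf{x}_\beta \rangle, \\
\|V\|^2 &= g_{\alpha\beta} V^\alpha V^\beta, \\
\|W\|^2 &= g_{\alpha\beta} W^\alpha W^\beta,
\end{aligned}$$

and

$$\begin{aligned}
\langle V, W \rangle &= \langle V^\alpha \mathbf{x}_\alpha, W^\beta \mathbf{x}_\beta \rangle = V^\alpha W^\beta \langle \mathbf{x}_\alpha, \mathbf{x}_\beta \rangle, \\
&= g_{\alpha\beta} V^\alpha W^\beta.
\end{aligned}$$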


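The angle $\theta$ subtended by the vectors $V$ and $W$ is then given by the equation

$$\begin{aligned}
\cos\theta &= \frac{\langle V, W \rangle}{\|V\| \cdot \|W\|} \\
&= \frac{I(V, W)}{\sqrt{I(V, V)}\, \sqrt{I(W, W)}} \\
&= \frac{g_{\alpha_1\beta_1} V^{\alpha_1} W^{\beta_1}}{\sqrt{g_{\alpha_2\beta_2} V^{\alpha_2} V^{\beta_2}}\, \sqrt{g_{\alpha_3\beta_3} W^{\alpha_3} W^{\beta_3}}},
\end{aligned} \tag{4.17}$$

where the numerical subscripts are needed for the $\alpha$ and $\beta$ indices to comply
with Einstein's summation convention.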
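
As a small illustration of equation (4.17), the following Python sketch computes the angle between two tangent vectors directly from their curvilinear components. The helper names `inner` and `angle`, and the representation of $g_{\alpha\beta}$ as a nested list, are assumptions made for this example only.

```python
import math

def inner(g, V, W):
    """<V, W> = g_ab V^a W^b for a 2x2 metric g given as nested lists."""
    return sum(g[a][b] * V[a] * W[b] for a in range(2) for b in range(2))

def angle(g, V, W):
    """Angle between tangent vectors V and W, following equation (4.17)."""
    c = inner(g, V, W) / math.sqrt(inner(g, V, V) * inner(g, W, W))
    # Clamp to [-1, 1] to guard against rounding just outside the domain of acos.
    return math.acos(max(-1.0, min(1.0, c)))

# Metric of the unit sphere at theta = pi/3: E = 1, F = 0, G = sin^2(theta).
theta = math.pi / 3
g = [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

right_angle = angle(g, [1.0, 0.0], [0.0, 1.0])
diagonal = angle(g, [1.0, 0.0], [1.0, 1.0 / math.sin(theta)])
```

In the second call the vector $W$ has unit length along each parametric direction, so it bisects the right angle between them.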
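Let $u^\alpha = \phi^\alpha(t)$ and $u^\alpha = \psi^\alpha(t)$ be two curves on the surface. Then the
total differentials

$$du^\alpha = \frac{d\phi^\alpha}{dt}\, dt, \quad \text{and} \quad \delta u^\alpha = \frac{d\psi^\alpha}{dt}\, \delta t$$

represent infinitesimal tangent vectors (1.23) to the curves. Thus, the angle
between two infinitesimal vectors tangent to two intersecting curves on the
surface satisfies the equation:

$$\cos\theta = \frac{g_{\alpha_1\beta_1}\, du^{\alpha_1} \delta u^{\beta_1}}{\sqrt{g_{\alpha_2\beta_2}\, du^{\alpha_2} du^{\beta_2}}\, \sqrt{g_{\alpha_3\beta_3}\, \delta u^{\alpha_3} \delta u^{\beta_3}}}. \tag{4.18}$$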


In particular, if the two curves happen to be the parametric curves, $u^1$ = const.
and $u^2$ = const., then along one curve we have $du^1 = 0$, with $du^2$ arbitrary, and
along the second $\delta u^1$ is arbitrary and $\delta u^2 = 0$. In this case, the cosine of the
angle subtended by the infinitesimal tangent vectors reduces to:

$$\cos\theta = \frac{g_{12}\, \delta u^1 du^2}{\sqrt{g_{11} (\delta u^1)^2}\, \sqrt{g_{22} (du^2)^2}} = \frac{g_{12}}{\sqrt{g_{11} g_{22}}} = \frac{F}{\sqrt{EG}}. \tag{4.19}$$

A simpler way to obtain this result is to recall that parametric directions are
given by $\mathbf{x}_u$ and $\mathbf{x}_v$, so

$$\cos\theta = \frac{\langle \mathbf{x}_u, \mathbf{x}_v \rangle}{\|\mathbf{x}_u\| \cdot \|\mathbf{x}_v\|} = \frac{F}{\sqrt{EG}}. \tag{4.20}$$

It follows immediately from the equation above that:


4.2.2 Proposition The parametric curves are orthogonal if F = 0. 

Orthogonal parametric curves are an important class of curves, because
locally the coordinate grid on the surface is similar to coordinate grids in basic
calculus, such as in polar coordinates for which $ds^2 = dr^2 + r^2 d\theta^2$.


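4.2.3. Examples a) Sphere

$$\begin{aligned}
\mathbf{x} &= (a\sin\theta\cos\phi, a\sin\theta\sin\phi, a\cos\theta), \\
\mathbf{x}_\theta &= (a\cos\theta\cos\phi, a\cos\theta\sin\phi, -a\sin\theta), \\
\mathbf{x}_\phi &= (-a\sin\theta\sin\phi, a\sin\theta\cos\phi, 0), \\
E &= \mathbf{x}_\theta \cdot \mathbf{x}_\theta = a^2, \\
F &= \mathbf{x}_\theta \cdot \mathbf{x}_\phi = 0, \\
G &= \mathbf{x}_\phi \cdot \mathbf{x}_\phi = a^2\sin^2\theta, \\
ds^2 &= a^2 d\theta^2 + a^2\sin^2\theta\, d\phi^2.
\end{aligned} \tag{4.21}$$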
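
The coefficients $E$, $F$, $G$ of a chart can also be estimated numerically, which gives a quick check on hand computations such as the one for the sphere. The sketch below is an illustration only: it approximates $\mathbf{x}_u$ and $\mathbf{x}_v$ by central differences, and the names `partials`, `first_fundamental_form`, and the step size `h` are assumptions, not notation from the text.

```python
import math

def partials(chart, u, v, h=1e-5):
    """Central-difference approximations of the tangent vectors x_u and x_v."""
    xu = tuple((p - m) / (2 * h) for p, m in zip(chart(u + h, v), chart(u - h, v)))
    xv = tuple((p - m) / (2 * h) for p, m in zip(chart(u, v + h), chart(u, v - h)))
    return xu, xv

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def first_fundamental_form(chart, u, v):
    """Return (E, F, G) = (x_u.x_u, x_u.x_v, x_v.x_v) at the point (u, v)."""
    xu, xv = partials(chart, u, v)
    return dot(xu, xu), dot(xu, xv), dot(xv, xv)

def sphere(a):
    """Chart of a sphere of radius a, with u = theta and v = phi."""
    def chart(theta, phi):
        return (a * math.sin(theta) * math.cos(phi),
                a * math.sin(theta) * math.sin(phi),
                a * math.cos(theta))
    return chart

E, F, G = first_fundamental_form(sphere(2.0), 1.0, 0.5)
```

For radius $a = 2$ at $\theta = 1$, the values should agree with $E = a^2$, $F = 0$, $G = a^2\sin^2\theta$ up to the finite-difference error.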


a) Loxodromes b) Escher Drawing c) Aloe Polyphylla 


There are many interesting curves on a sphere, but amongst these the loxodromes
have a special role in history. A loxodrome is a curve that winds
around a sphere making a constant angle with the meridians. In this sense,
it is the spherical analog of a cylindrical helix and as such it is often called a
spherical helix. The curves were significant in early navigation where they are
referred to as rhumb lines. As people in the late 1400's began to rediscover that
earth was not flat, cartographers figured out methods to render maps on flat
paper surfaces. One such technique is called the Mercator projection, which is
obtained by projecting the sphere onto a plane that wraps around the sphere
as a cylinder tangential to the sphere along the equator.

As we will discuss in more detail later, a navigator travelling a constant
bearing would be moving on a straight path on the Mercator projection map,
but on the sphere it would be spiraling ever faster as one approached the poles.
Thus, it became important to understand the nature of such paths. It appears
that the first quantitative treatise on loxodromes was carried out in the mid 1500's
by the Portuguese applied mathematician Pedro Nunes, who was chair of the
department at the University of Coimbra.

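As an application, we will derive the equations of loxodromes and compute
the arc length. A general spherical curve can be parametrized in the form
$\gamma(t) = \mathbf{x}(\theta(t), \phi(t))$. Let $\sigma$ be the angle the curve makes with the meridians
$\phi$ = constant. Then, recalling that $\langle \mathbf{x}_\theta, \mathbf{x}_\phi \rangle = F = 0$, we have:

$$\begin{aligned}
\gamma' &= \mathbf{x}_\theta \frac{d\theta}{dt} + \mathbf{x}_\phi \frac{d\phi}{dt}, \\
\cos\sigma &= \frac{\langle \mathbf{x}_\theta, \gamma' \rangle}{\|\mathbf{x}_\theta\| \cdot \|\gamma'\|} = \frac{E \frac{d\theta}{dt}}{\sqrt{E}\, \frac{ds}{dt}} = a\, \frac{d\theta}{ds}, \\
a^2 d\theta^2 &= \cos^2\sigma\, ds^2, \\
a^2 \sin^2\sigma\, d\theta^2 &= a^2 \cos^2\sigma \sin^2\theta\, d\phi^2, \\
\sin\sigma\, d\theta &= \pm \cos\sigma \sin\theta\, d\phi, \\
\csc\theta\, d\theta &= \pm \cot\sigma\, d\phi.
\end{aligned}$$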


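The convention used by cartographers is to measure the angle $\theta$ from the equator.
To better adhere to the history, but at the same time avoiding confusion, we
replace $\theta$ with $\vartheta = \frac{\pi}{2} - \theta$, so that $\vartheta = 0$ corresponds to the equator. Integrating
the last equation with this change, we get

$$\begin{aligned}
\sec\vartheta\, d\vartheta &= \pm \cot\sigma\, d\phi, \\
\ln\tan\left(\frac{\vartheta}{2} + \frac{\pi}{4}\right) &= \pm \cot\sigma\, (\phi - \phi_0).
\end{aligned}$$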


Thus, we conclude that the equations of loxodromes and their arc lengths are
given by

$$\phi = \pm(\tan\sigma) \ln\tan\left(\frac{\vartheta}{2} + \frac{\pi}{4}\right) + \phi_0, \tag{4.22}$$

$$s = a(\theta - \theta_0) \sec\sigma, \tag{4.23}$$

where $\theta_0$ and $\phi_0$ are the coordinates of the initial position. Figure 4.2 shows
four loxodromes equally distributed around the sphere.
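
The arc length formula (4.23) can be tested numerically against the curve (4.22). The following Python sketch is only an illustration under stated assumptions: it takes the positive sign in (4.22), sets $\phi_0 = 0$ by default, and approximates the length by summing short chords; the function names and the number of chords `n` are hypothetical choices.

```python
import math

def loxodrome_phi(theta, sigma, phi0=0.0):
    """Longitude on the loxodrome (4.22), with latitude = pi/2 - theta."""
    lat = math.pi / 2 - theta
    return math.tan(sigma) * math.log(math.tan(lat / 2 + math.pi / 4)) + phi0

def point(a, theta, phi):
    """Point on the sphere of radius a with colatitude theta and longitude phi."""
    return (a * math.sin(theta) * math.cos(phi),
            a * math.sin(theta) * math.sin(phi),
            a * math.cos(theta))

def loxodrome_length(a, sigma, theta0, theta1, n=20000):
    """Approximate the arc length between two colatitudes by summing n chords."""
    total = 0.0
    prev = point(a, theta0, loxodrome_phi(theta0, sigma))
    for k in range(1, n + 1):
        theta = theta0 + (theta1 - theta0) * k / n
        cur = point(a, theta, loxodrome_phi(theta, sigma))
        total += math.dist(prev, cur)
        prev = cur
    return total

# Unit sphere, sigma = 60 degrees, colatitude running from 0.5 to 1.5 radians.
numeric = loxodrome_length(1.0, math.pi / 3, 0.5, 1.5)
predicted = 1.0 * (1.5 - 0.5) / math.cos(math.pi / 3)   # a (theta - theta0) sec(sigma)
```

The two values should agree closely, since the chord sum converges to the arc length as `n` grows.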

Loxodromes were the basis for a number of beautiful drawings and woodcuts
by M. C. Escher. Figure 4.2 also shows one more beautiful manifestation of
geometry in nature in a plant called Aloe Polyphylla. Not surprisingly, the
plant has 5 loxodromes, which is a Fibonacci number. We will show later, under
the discussion of conformal (angle preserving) maps in section 5.2.2, that
loxodromes map into straight lines making a constant angle with meridians in
the Mercator projection (see Figure 5.9).


b) Surface of Revolution 


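$$\begin{aligned}
\mathbf{x} &= (r\cos\theta, r\sin\theta, f(r)), \\
\mathbf{x}_r &= (\cos\theta, \sin\theta, f'(r)), \\
\mathbf{x}_\theta &= (-r\sin\theta, r\cos\theta, 0), \\
E &= \mathbf{x}_r \cdot \mathbf{x}_r = 1 + f'^2(r), \\
F &= \mathbf{x}_r \cdot \mathbf{x}_\theta = 0, \\
G &= \mathbf{x}_\theta \cdot \mathbf{x}_\theta = r^2, \\
ds^2 &= [1 + f'^2(r)]\, dr^2 + r^2 d\theta^2.
\end{aligned}$$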


As in figure 4.6, we have chosen a Gaussian profile to illustrate a surface of
revolution. Since $F = 0$ the parametric lines are orthogonal. The picture shows
that this is indeed the case. At any point of the surface, the analogs of meridians
and parallels intersect at right angles.


Fig. 4.6: Surface of Revolution and Pseudosphere 


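c) Pseudosphere

$$\begin{aligned}
\mathbf{x} &= \left(a\sin u\cos v, a\sin u\sin v, a\left(\cos u + \ln\tan\tfrac{u}{2}\right)\right), \\
E &= a^2\cot^2 u, \\
F &= 0, \\
G &= a^2\sin^2 u, \\
ds^2 &= a^2\cot^2 u\, du^2 + a^2\sin^2 u\, dv^2.
\end{aligned}$$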


The pseudosphere is a surface of revolution in which the profile curve is a tractrix.
The tractrix curve originated in a problem posed by Leibniz to the
effect of finding the path traced by a point initially placed on the horizontal
axis at a distance $a$ from the origin, as it was pulled along the vertical axis by a
taut string of constant length $a$, as shown in figure 4.6. The tractrix was later
studied by Huygens in 1692. Colloquially this is the path of a reluctant dog
at $(a, 0)$ dragged by a man walking up the z-axis. The tangent segment is the
hypotenuse of a right triangle with base $x$ and height $\sqrt{a^2 - x^2}$, so the slope
is $dz/dx = -\sqrt{a^2 - x^2}/x$. Using the trigonometric substitution $x = a\sin u$, we
get $z = -a\int (\cos^2 u/\sin u)\, du$, which, up to the reflection $z \to -z$, leads to the appropriate form for the profile
of the surface of revolution. The pseudosphere was studied by Beltrami in 1868.
He discovered that in spite of the surface extending asymptotically to infinity,
the surface area is finite with $S = 4\pi a^2$ as in a sphere of the same radius, and
the volume enclosed is half that of the sphere. We will have much more to say about
this surface.


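Fig. 4.7: Examples of Surfaces: d) Torus, e) Helicoid, f) Catenoid


d) Torus

$$\begin{aligned}
\mathbf{x} &= ((b + a\cos u)\cos v, (b + a\cos u)\sin v, a\sin u) \quad \text{(see 4.8)}, \\
E &= a^2, \\
F &= 0, \\
G &= (b + a\cos u)^2, \\
ds^2 &= a^2 du^2 + (b + a\cos u)^2 dv^2.
\end{aligned} \tag{4.24}$$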


e) Helicoid

$$\begin{aligned}
\mathbf{x} &= (u\cos v, u\sin v, av), \\
E &= 1, \\
F &= 0, \\
G &= u^2 + a^2, \\
ds^2 &= du^2 + (u^2 + a^2)\, dv^2.
\end{aligned} \tag{4.25}$$

Coordinate curves $u = c$ are helices.


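f) Catenoid

$$\begin{aligned}
\mathbf{x} &= \left(u\cos v, u\sin v, c\cosh^{-1}\frac{u}{c}\right), \\
E &= \frac{u^2}{u^2 - c^2}, \\
F &= 0, \\
G &= u^2, \\
ds^2 &= \frac{u^2}{u^2 - c^2}\, du^2 + u^2 dv^2.
\end{aligned} \tag{4.26}$$

This is a catenary of revolution.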


g) Cone and Conical Helix
The equation $z^2 = \cot^2\alpha\, (x^2 + y^2)$ represents a circular cone whose generator
makes an angle $\alpha$ with the z-axis. In parametric form,

$$\begin{aligned}
\mathbf{x} &= (r\cos\phi, r\sin\phi, r\cot\alpha), \\
E &= \csc^2\alpha, \\
F &= 0, \\
G &= r^2, \\
ds^2 &= \csc^2\alpha\, dr^2 + r^2 d\phi^2.
\end{aligned} \tag{4.27}$$


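A conical helix is a curve $\gamma(t) = \mathbf{x}(r(t), \phi(t))$ that makes a constant angle $\sigma$
with the generators of the cone. Similar to the case of loxodromes, we have

$$\begin{aligned}
\gamma' &= \mathbf{x}_r \frac{dr}{dt} + \mathbf{x}_\phi \frac{d\phi}{dt}, \\
\cos\sigma &= \frac{\langle \mathbf{x}_r, \gamma' \rangle}{\|\mathbf{x}_r\| \cdot \|\gamma'\|} = \frac{E \frac{dr}{dt}}{\sqrt{E}\, \frac{ds}{dt}} = \sqrt{E}\, \frac{dr}{ds}, \\
E\, dr^2 &= \cos^2\sigma\, ds^2, \\
\csc^2\alpha\, dr^2 &= \cos^2\sigma\, (\csc^2\alpha\, dr^2 + r^2 d\phi^2), \\
\csc^2\alpha \sin^2\sigma\, dr^2 &= r^2\cos^2\sigma\, d\phi^2, \\
\frac{1}{r}\, dr &= \cot\sigma \sin\alpha\, d\phi.
\end{aligned}$$

Therefore, the equations of a conical helix are given by

$$r = c\, e^{(\cot\sigma \sin\alpha)\phi}. \tag{4.28}$$
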

As shown in figure 4.8, a conical helix projects into the plane as a logarithmic
spiral. Many sea shells and other natural objects neatly exhibit such
conical spirals. The picture shown here is that of lobatus gigas or caracol pala,
previously known as strombus gigas. This particular one is included here with
a certain degree of nostalgia, for it has been a decorative item for decades in our
family. The shell was probably found in Santa Cruz del Islote, Archipelago de
San Bernardo, located in the Gulf of Morrosquillo on the Caribbean coast of
Colombia. In this densely populated island paradise, which then enjoyed the
pulchritude of enchanting coral reefs, the shells are now virtually extinct as the
coral has succumbed to bleaching with rising temperatures of the waters. The
shell shows a cut in the spire which the island natives use to sever the columellar
muscle and thus release the edible snail.




Fig. 4.8: Conical Helix. 


4.3 The Second Fundamental Form 


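Let $\mathbf{x} = \mathbf{x}(u^\alpha)$ be a coordinate patch on a surface $M$. Since $\mathbf{x}_u$ and $\mathbf{x}_v$ are
tangential to the surface, we can construct a unit normal $\mathbf{n}$ to the surface by
taking

$$\mathbf{n} = \frac{\mathbf{x}_u \times \mathbf{x}_v}{\|\mathbf{x}_u \times \mathbf{x}_v\|}. \tag{4.29}$$

Fig. 4.9: Surface Normal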


Now, consider a curve on the surface given by $u^\alpha = u^\alpha(s)$. Without loss
of generality, we assume that the curve is parametrized by arc length $s$ so
that the curve has unit speed. Let $e = \{T, N, B\}$ be the Frenet frame of the
curve. Recall that the rate of change $\overline{\nabla}_T W$ of a vector field $W$ along the curve
corresponds to the classical vector $W' = \frac{dW}{ds}$, so $\overline{\nabla} W$ is associated with the vector
$dW$. Thus the connection equation $\overline{\nabla} e = e\omega$ is given by

$$d[T, N, B] = [T, N, B] \begin{bmatrix} 0 & -\kappa\, ds & 0 \\ \kappa\, ds & 0 & -\tau\, ds \\ 0 & \tau\, ds & 0 \end{bmatrix}. \tag{4.30}$$


Following ideas first introduced by Darboux and subsequently perfected by
Cartan, we introduce a new orthonormal frame $f = \{T, \mathbf{g}, \mathbf{n}\}$ adapted to the
surface, where at each point, $T$ is the common tangent to the surface and to
the curve on the surface, $\mathbf{n}$ is the unit normal to the surface and $\mathbf{g} = \mathbf{n} \times T$.
Since the two orthonormal frames must be related by a rotation that leaves the
$T$ vector fixed, we have $f = eB$, where $B$ is a matrix of the form

$$B = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{bmatrix}. \tag{4.31}$$

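We wish to find $\overline{\nabla} f = f\overline{\omega}$. A short computation using the change of basis
equations $\overline{\omega} = B^{-1}\omega B + B^{-1} dB$ (see equations 3.48 and 3.51) gives:

$$\begin{aligned}
d[T, \mathbf{g}, \mathbf{n}] &= [T, \mathbf{g}, \mathbf{n}] \begin{bmatrix} 0 & -\kappa\cos\theta\, ds & -\kappa\sin\theta\, ds \\ \kappa\cos\theta\, ds & 0 & -\tau\, ds + d\theta \\ \kappa\sin\theta\, ds & \tau\, ds - d\theta & 0 \end{bmatrix}, && (4.32) \\
&= [T, \mathbf{g}, \mathbf{n}] \begin{bmatrix} 0 & -\kappa_g\, ds & -\kappa_n\, ds \\ \kappa_g\, ds & 0 & -\tau_g\, ds \\ \kappa_n\, ds & \tau_g\, ds & 0 \end{bmatrix}, && (4.33)
\end{aligned}$$

where:
$\kappa_n = \kappa\sin\theta$ is called the normal curvature,
$\kappa_g = \kappa\cos\theta$ is called the geodesic curvature; $\mathbf{K}_g = \kappa_g \mathbf{g}$ the geodesic curvature
vector, and
$\tau_g = \tau - d\theta/ds$ is called the geodesic torsion.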

We conclude that we can decompose $T'$ and the curvature $\kappa$ into their
normal and surface tangent space components (see figure 4.10)

$$T' = \kappa_n \mathbf{n} + \kappa_g \mathbf{g}, \tag{4.34}$$

$$\kappa^2 = \kappa_n^2 + \kappa_g^2. \tag{4.35}$$

The normal curvature $\kappa_n$ measures the curvature of $\mathbf{x}(u^\alpha(s))$ resulting from the
constraint of the curve to lie on a surface. The geodesic curvature $\kappa_g$ measures
the "sideward" component of the curvature in the tangent plane to the surface.
Thus, if one draws a straight line on a flat piece of paper and then smoothly
bends the paper into a surface, the line acquires some curvature. Since the line
was originally straight, there is no sideward component of curvature so $\kappa_g = 0$
in this case. This means that the entire contribution to the curvature comes
from the normal component, reflecting the fact that the only reason there is
curvature here is due to the bend in the surface itself. In this sense, a curve on a
surface for which the geodesic curvature vanishes at all points reflects locally the
shortest path between two points. These curves are therefore called geodesics
of the surface. The property of minimizing the path between two points is a
local property. For example, on a sphere one would expect the geodesics to be
great circles. However, travelling from Los Angeles to San Francisco along one
such great circle, there is a short path and a very long one that goes around
the earth.




If one specifies a point $p \in M$ and a direction vector $X_p \in T_pM$, one can
geometrically envision the normal curvature by considering the equivalence class
of all unit speed curves in $M$ that contain the point $p$ and whose tangent vectors
line up with the direction of $X$. Of course, there are infinitely many such
curves, but at an infinitesimal level, all these curves can be obtained by
intersecting the surface with a "vertical" plane containing the vector $X$ and the
normal to $M$. All curves in this equivalence class have the same normal
curvature and their geodesic curvatures vanish. In this sense, the normal curvature
is more of a property pertaining to a direction on the surface at a point,
whereas the geodesic curvature really depends on the curve itself. It might be
impossible for a hiker walking on the undulating hills of the Ozarks to find a
straight line trail, since the rolling hills of the terrain extend in all directions. It
might be possible, however, for the hiker to walk on a path with zero geodesic
curvature as long as the same compass direction is maintained. We will come back
to the Cartan structure equations associated with the Darboux frame, but for
computational purposes, the classical approach is very practical.

Fig. 4.10: Curvature

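Using the chain rule, we see that the unit tangent vector $T$ to the curve is given
by

$$T = \frac{d\mathbf{x}}{ds} = \frac{\partial \mathbf{x}}{\partial u^\alpha} \frac{du^\alpha}{ds} = \mathbf{x}_\alpha \frac{du^\alpha}{ds}. \tag{4.36}$$

To find an explicit formula for the normal curvature we first differentiate equation
(4.36)

$$\begin{aligned}
T' &= \frac{dT}{ds} \\
&= \frac{d}{ds}\left(\mathbf{x}_\alpha \frac{du^\alpha}{ds}\right), \\
&= \frac{d}{ds}(\mathbf{x}_\alpha) \frac{du^\alpha}{ds} + \mathbf{x}_\alpha \frac{d^2 u^\alpha}{ds^2}, \\
&= \left(\frac{\partial \mathbf{x}_\alpha}{\partial u^\beta} \frac{du^\beta}{ds}\right) \frac{du^\alpha}{ds} + \mathbf{x}_\alpha \frac{d^2 u^\alpha}{ds^2}, \\
&= \mathbf{x}_{\alpha\beta} \frac{du^\alpha}{ds} \frac{du^\beta}{ds} + \mathbf{x}_\alpha \frac{d^2 u^\alpha}{ds^2}.
\end{aligned}$$

Taking the inner product of the last equation with the normal and noticing that
$\langle \mathbf{x}_\alpha, \mathbf{n} \rangle = 0$, we get

$$\begin{aligned}
\kappa_n &= \langle T', \mathbf{n} \rangle = \langle \mathbf{x}_{\alpha\beta}, \mathbf{n} \rangle \frac{du^\alpha}{ds} \frac{du^\beta}{ds}, \\
&= \frac{b_{\alpha\beta}\, du^\alpha du^\beta}{g_{\alpha\beta}\, du^\alpha du^\beta},
\end{aligned} \tag{4.37}$$

where

$$b_{\alpha\beta} = \langle \mathbf{x}_{\alpha\beta}, \mathbf{n} \rangle. \tag{4.38}$$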


4.3.1 Definition The expression

$$II = b_{\alpha\beta}\, du^\alpha \otimes du^\beta \tag{4.39}$$

is called the second fundamental form.


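4.3.2 Proposition The second fundamental form is symmetric.

Proof In the classical formulation of the second fundamental form, the proof
is trivial. We have $b_{\alpha\beta} = b_{\beta\alpha}$, since for a $C^\infty$ patch $\mathbf{x}(u^\alpha)$, we have $\mathbf{x}_{\alpha\beta} = \mathbf{x}_{\beta\alpha}$,
because the partial derivatives commute. We will denote the coefficients of the
second fundamental form as follows:

$$\begin{aligned}
e &= b_{11} = \langle \mathbf{x}_{uu}, \mathbf{n} \rangle, \\
f &= b_{12} = \langle \mathbf{x}_{uv}, \mathbf{n} \rangle, \\
&= b_{21} = \langle \mathbf{x}_{vu}, \mathbf{n} \rangle, \\
g &= b_{22} = \langle \mathbf{x}_{vv}, \mathbf{n} \rangle,
\end{aligned}$$

so that equation (4.39) can be written as

$$II = e\, du^2 + 2f\, du\, dv + g\, dv^2. \tag{4.40}$$

It follows that the equation for the normal curvature (4.37) can be written
explicitly as

$$\kappa_n = \frac{II}{I} = \frac{e\, du^2 + 2f\, du\, dv + g\, dv^2}{E\, du^2 + 2F\, du\, dv + G\, dv^2}. \tag{4.41}$$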


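We should point out that just as the first fundamental form can be represented
as

$$I = \langle d\mathbf{x}, d\mathbf{x} \rangle,$$

we can represent the second fundamental form as

$$II = -\langle d\mathbf{x}, d\mathbf{n} \rangle.$$

To see this, it suffices to note that differentiation of the identity $\langle \mathbf{x}_\alpha, \mathbf{n} \rangle = 0$
implies that

$$\langle \mathbf{x}_{\alpha\beta}, \mathbf{n} \rangle = -\langle \mathbf{x}_\alpha, \mathbf{n}_\beta \rangle.$$

Therefore,

$$\begin{aligned}
\langle d\mathbf{x}, d\mathbf{n} \rangle &= \langle \mathbf{x}_\alpha\, du^\alpha, \mathbf{n}_\beta\, du^\beta \rangle, \\
&= \langle \mathbf{x}_\alpha, \mathbf{n}_\beta \rangle\, du^\alpha du^\beta, \\
&= -\langle \mathbf{x}_{\alpha\beta}, \mathbf{n} \rangle\, du^\alpha du^\beta, \\
&= -II.
\end{aligned}$$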


4.3.3 Definition Directions on a surface along which the second fundamental
form vanishes,

$$e\, du^2 + 2f\, du\, dv + g\, dv^2 = 0, \tag{4.42}$$

are called asymptotic directions, and curves having these directions
are called asymptotic curves. This happens for example when there are straight
lines on the surface, as in the case of the intersection of the saddle $z = xy$ with
the plane $z = 0$.


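For now, we state without elaboration that one can also define the third
fundamental form by

$$III = \langle d\mathbf{n}, d\mathbf{n} \rangle = \langle \mathbf{n}_\alpha, \mathbf{n}_\beta \rangle\, du^\alpha du^\beta. \tag{4.43}$$

From a computational point of view, a more useful formula for the coefficients
of the second fundamental form can be derived by first applying the classical
vector identity

$$(A \times B) \cdot (C \times D) = \begin{vmatrix} A \cdot C & A \cdot D \\ B \cdot C & B \cdot D \end{vmatrix} \tag{4.44}$$

to compute

$$\begin{aligned}
\|\mathbf{x}_u \times \mathbf{x}_v\|^2 &= (\mathbf{x}_u \times \mathbf{x}_v) \cdot (\mathbf{x}_u \times \mathbf{x}_v), \\
&= \begin{vmatrix} \mathbf{x}_u \cdot \mathbf{x}_u & \mathbf{x}_u \cdot \mathbf{x}_v \\ \mathbf{x}_v \cdot \mathbf{x}_u & \mathbf{x}_v \cdot \mathbf{x}_v \end{vmatrix}, \\
&= EG - F^2.
\end{aligned} \tag{4.45}$$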

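Consequently, the normal vector can be written as

$$\mathbf{n} = \frac{\mathbf{x}_u \times \mathbf{x}_v}{\|\mathbf{x}_u \times \mathbf{x}_v\|} = \frac{\mathbf{x}_u \times \mathbf{x}_v}{\sqrt{EG - F^2}}.$$

It follows that we can write the coefficients $b_{\alpha\beta}$ directly as triple products
involving derivatives of $\mathbf{x}$. The expressions for these coefficients are

$$\begin{aligned}
e &= \frac{(\mathbf{x}_u\, \mathbf{x}_v\, \mathbf{x}_{uu})}{\sqrt{EG - F^2}}, \\
f &= \frac{(\mathbf{x}_u\, \mathbf{x}_v\, \mathbf{x}_{uv})}{\sqrt{EG - F^2}}, \\
g &= \frac{(\mathbf{x}_u\, \mathbf{x}_v\, \mathbf{x}_{vv})}{\sqrt{EG - F^2}}.
\end{aligned} \tag{4.46}$$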


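4.3.4 Example Sphere
Going back to equation 4.21, we have:

$$\begin{aligned}
\mathbf{x}_{\theta\theta} &= (-a\sin\theta\cos\phi, -a\sin\theta\sin\phi, -a\cos\theta), \\
\mathbf{x}_{\theta\phi} &= (-a\cos\theta\sin\phi, a\cos\theta\cos\phi, 0), \\
\mathbf{x}_{\phi\phi} &= (-a\sin\theta\cos\phi, -a\sin\theta\sin\phi, 0), \\
\mathbf{n} &= (\sin\theta\cos\phi, \sin\theta\sin\phi, \cos\theta), \\
e &= \mathbf{x}_{\theta\theta} \cdot \mathbf{n} = -a, \\
f &= \mathbf{x}_{\theta\phi} \cdot \mathbf{n} = 0, \\
g &= \mathbf{x}_{\phi\phi} \cdot \mathbf{n} = -a\sin^2\theta, \\
II &= -\frac{1}{a}\, I.
\end{aligned}$$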
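
The triple product formulas (4.46) lend themselves to a direct numerical check. The sketch below is an illustration under stated assumptions: derivatives are approximated by central differences with a step `h`, and the helper names are hypothetical. It is applied to the sphere chart of radius $a = 2$.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def second_fundamental_form(chart, u, v, h=1e-4):
    """Approximate (e, f, g) at (u, v) using the triple products in (4.46)."""
    x0 = chart(u, v)
    up, um = chart(u + h, v), chart(u - h, v)
    vp, vm = chart(u, v + h), chart(u, v - h)
    xu = tuple((p - m) / (2 * h) for p, m in zip(up, um))
    xv = tuple((p - m) / (2 * h) for p, m in zip(vp, vm))
    xuu = tuple((p - 2 * c + m) / h ** 2 for p, c, m in zip(up, x0, um))
    xvv = tuple((p - 2 * c + m) / h ** 2 for p, c, m in zip(vp, x0, vm))
    corners = zip(chart(u + h, v + h), chart(u + h, v - h),
                  chart(u - h, v + h), chart(u - h, v - h))
    xuv = tuple((pp - pm - mp + mm) / (4 * h * h) for pp, pm, mp, mm in corners)
    w = cross(xu, xv)                  # x_u x x_v
    norm = math.sqrt(dot(w, w))        # equals sqrt(EG - F^2) by (4.45)
    return dot(xuu, w) / norm, dot(xuv, w) / norm, dot(xvv, w) / norm

def sphere(a):
    """Chart of a sphere of radius a, with u = theta and v = phi."""
    def chart(theta, phi):
        return (a * math.sin(theta) * math.cos(phi),
                a * math.sin(theta) * math.sin(phi),
                a * math.cos(theta))
    return chart

e, f, g = second_fundamental_form(sphere(2.0), 1.0, 0.5)
```

The results should be close to $e = -a$, $f = 0$, and $g = -a\sin^2\theta$, with the sign fixed by the outward normal $\mathbf{x}_\theta \times \mathbf{x}_\phi$.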


The first fundamental form on a surface measures the square of the distance
between two infinitesimally separated points. There is a similar interpretation
of the second fundamental form as we show below. The second fundamental
form measures the distance from a point on the surface to the tangent plane
at a second infinitesimally separated point. To see this simple geometrical
interpretation, consider a point $\mathbf{x}_0 = \mathbf{x}(u_0^\alpha) \in M$ and a nearby point $\mathbf{x}(u_0^\alpha + du^\alpha)$.
Expanding in a Taylor series, we get

$$\mathbf{x}(u_0^\alpha + du^\alpha) = \mathbf{x}_0 + (\mathbf{x}_0)_\alpha\, du^\alpha + \frac{1}{2} (\mathbf{x}_0)_{\alpha\beta}\, du^\alpha du^\beta + \dots$$

We recall that the distance formula from a point $\mathbf{x}$ to a plane which contains
$\mathbf{x}_0$ is just the scalar projection of $(\mathbf{x} - \mathbf{x}_0)$ onto the normal. Since the normal to
the plane at $\mathbf{x}_0$ is the same as the unit normal to the surface and $\langle \mathbf{x}_\alpha, \mathbf{n} \rangle = 0$,
we find that the distance $D$ is

$$\begin{aligned}
D &= \langle \mathbf{x} - \mathbf{x}_0, \mathbf{n} \rangle, \\
&= \frac{1}{2} \langle (\mathbf{x}_0)_{\alpha\beta}, \mathbf{n} \rangle\, du^\alpha du^\beta, \\
&= \frac{1}{2}\, II_0.
\end{aligned}$$


The first fundamental form (or, rather, its determinant) also appears in calculus
in the context of calculating the area of a parametrized surface. If one considers
an infinitesimal parallelogram subtended by the vectors $\mathbf{x}_u\, du$ and $\mathbf{x}_v\, dv$, then
the differential of surface area is given by the length of the cross product of
these two infinitesimal tangent vectors. That is,

$$\begin{aligned}
dS &= \|\mathbf{x}_u \times \mathbf{x}_v\|\, du\, dv, \\
S &= \iint \sqrt{EG - F^2}\, du\, dv.
\end{aligned}$$
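
The area formula can be tried out with a simple quadrature. The following Python sketch is illustrative only: it uses a midpoint rule on a uniform grid, and the function `surface_area`, its arguments, and the grid size `n` are assumptions made for this example. The metrics used are those of the sphere (4.21) and the torus (4.24).

```python
import math

def surface_area(metric, u_range, v_range, n=200):
    """Midpoint-rule approximation of S = double integral of sqrt(EG - F^2)."""
    (u0, u1), (v0, v1) = u_range, v_range
    hu, hv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * hu
        for j in range(n):
            v = v0 + (j + 0.5) * hv
            E, F, G = metric(u, v)
            total += math.sqrt(E * G - F * F) * hu * hv
    return total

def sphere_metric(a):
    """(E, F, G) for the sphere of radius a, with u = theta and v = phi."""
    return lambda theta, phi: (a * a, 0.0, (a * math.sin(theta)) ** 2)

def torus_metric(a, b):
    """(E, F, G) for the torus with tube radius a and center radius b > a."""
    return lambda u, v: (a * a, 0.0, (b + a * math.cos(u)) ** 2)

sphere_area = surface_area(sphere_metric(1.0), (0.0, math.pi), (0.0, 2 * math.pi))
torus_area = surface_area(torus_metric(1.0, 3.0), (0.0, 2 * math.pi), (0.0, 2 * math.pi))
```

The sphere result should approach $4\pi a^2$, and the torus result should approach $4\pi^2 ab$, which follows from integrating $a(b + a\cos u)$ over both angles.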


The second fundamental form contains information about the shape of the
surface at a point. For example, the discussion above indicates that if $b = |b_{\alpha\beta}| = eg - f^2 > 0$ then all the neighboring points lie on the same side of the
tangent plane, and hence, the surface is concave in one direction. If at a point
on a surface $b > 0$, the point is called an elliptic point, if $b < 0$, the point is
called hyperbolic or a saddle point, and if $b = 0$, the point is called parabolic.


4.4 Curvature 


The concept of curvature and its relation to the fundamental forms constitute
the central object of study in differential geometry. One would like to
be able to answer questions such as "what quantities remain invariant as one
surface is smoothly changed into another?" There is certainly something
intrinsically different between a cone, which we can construct from a flat piece
of paper, and a sphere, which we cannot. What is it that makes these two
surfaces so different? How does one calculate the shortest path between two
objects when the path is constrained to lie on a surface?

These and questions of similar type can be quantitatively answered through 
the study of curvature. We cannot overstate the importance of this subject; 
perhaps it suffices to say that, without a clear understanding of curvature, 
there would be no general theory of relativity, no concept of black holes, and 
even more disastrous, no Star Trek. 

The notion of curvature of a hypersurface in $\mathbf{R}^n$ (a surface of dimension $n - 1$)
begins by studying the covariant derivative of the normal to the surface. If the
normal to a surface is constant, then the surface is a flat hyperplane. Variations
in the normal are indicative of the presence of curvature. For simplicity, we
constrain our discussion to surfaces in $\mathbf{R}^3$, but the formalism we use is applicable
to any dimension. We will also introduce the modern version of the second
fundamental form.


4.4.1 Classical Formulation of Curvature 


The normal curvature $\kappa_n$ at any point on a surface measures the deviation
from flatness as one moves along a direction tangential to the surface at that
point. The direction can be taken as the unit tangent vector to a curve on
the surface. We seek the directions in which the normal curvature attains the
extrema. For this purpose, let the curve on the surface be given by $v = v(u)$
and let $\lambda = \frac{dv}{du}$. Then we can write the normal curvature 4.41 in the form

$$\kappa_n = \frac{II^*}{I^*} = \frac{e + 2f\lambda + g\lambda^2}{E + 2F\lambda + G\lambda^2}, \tag{4.47}$$
where $II^*$ and $I^*$ are the numerator and denominator respectively. To find the
extrema, we take the derivative with respect to $\lambda$ and set it equal to zero. The
resulting fraction is zero only when the numerator is zero, so from the quotient
rule we get

$$I^*(2f + 2g\lambda) - II^*(2F + 2G\lambda) = 0.$$

It follows that

$$\kappa_n = \frac{II^*}{I^*} = \frac{f + g\lambda}{F + G\lambda}.$$

On the other hand, combining with equation 4.47 we have

$$\kappa_n = \frac{(e + f\lambda) + \lambda(f + g\lambda)}{(E + F\lambda) + \lambda(F + G\lambda)} = \frac{f + g\lambda}{F + G\lambda}. \tag{4.48}$$

This can only happen if

$$\kappa_n = \frac{f + g\lambda}{F + G\lambda} = \frac{e + f\lambda}{E + F\lambda}. \tag{4.49}$$


Equation 4.49 contains a wealth of information. On one hand, we can eliminate
$\kappa_n$, which leads to the quadratic equation for $\lambda$

$$(Fg - Gf)\lambda^2 + (Eg - Ge)\lambda + (Ef - Fe) = 0.$$

Recalling that $\lambda = dv/du$, and noticing that the coefficients resemble minors of
a $3 \times 3$ matrix, we can elegantly rewrite the equation as

$$\begin{vmatrix} dv^2 & -du\, dv & du^2 \\ E & F & G \\ e & f & g \end{vmatrix} = 0. \tag{4.50}$$

Equation 4.50 determines two directions $\frac{dv}{du}$ along which the normal curvature
attains the extrema, except for special cases when either $b_{\alpha\beta} = 0$, or $b_{\alpha\beta}$ and
$g_{\alpha\beta}$ are proportional, which would cause the determinant to be identically zero.
These two directions are called principal directions of curvature, each associated
with an extremum of the normal curvature. We will have much more to say
about these shortly.


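On the other hand, we can write equations 4.49 in the form

$$\begin{aligned}
(e - E\kappa_n) + \lambda(f - F\kappa_n) &= 0, \\
(f - F\kappa_n) + \lambda(g - G\kappa_n) &= 0.
\end{aligned}$$

Solving each equation for $\lambda$ we can eliminate $\lambda$ instead, and we are led to a
quadratic equation for $\kappa_n$ which we can write as

$$\begin{vmatrix} e - E\kappa_n & f - F\kappa_n \\ f - F\kappa_n & g - G\kappa_n \end{vmatrix} = 0. \tag{4.51}$$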


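It is interesting to note that equation 4.51 can be written as

$$\left\| \begin{bmatrix} e & f \\ f & g \end{bmatrix} - \kappa_n \begin{bmatrix} E & F \\ F & G \end{bmatrix} \right\| = 0.$$

In other words, the extrema for the values of the normal curvature are the solutions of
the equation

$$\| b_{\alpha\beta} - \kappa_n g_{\alpha\beta} \| = 0. \tag{4.52}$$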


Had it been the case that gag = dag, the reader would recognize this as a 
eigenvalue equation for a symmetric matrix giving rise to two invariants, that 
is, the trace and the determinant of the matrix. We will treat this formally in 
the next section. The explicit quadratic expression for the extrema of kn is 


(EG — F*)x? — (Eg — 2F f + Ge)rn + (eg — f?) =0. 




We conclude there are two solutions $\kappa_1$ and $\kappa_2$ such that

$$K = \kappa_1 \kappa_2 = \frac{eg - f^2}{EG - F^2}, \tag{4.53}$$

and

$$M = \tfrac{1}{2}(\kappa_1 + \kappa_2) = \frac{1}{2}\,\frac{Eg - 2Ff + Ge}{EG - F^2}. \tag{4.54}$$

The quantity $K$ is called the Gaussian curvature and $M$ is called the mean
curvature. To understand better the deep significance of the last two equations,
we introduce the modern formulation, which will allow us to draw conclusions
from the inextricable connection of these results with the linear algebra spectral
theorem for symmetric operators.
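As a numerical illustration of equations 4.53 and 4.54, the following minimal Python sketch solves the quadratic for the extrema of the normal curvature and compares the roots with the closed-form expressions. The function names and the sample coefficients are assumptions made for this illustration: the values are those of a torus with tube radius a and center radius b, with the sign of the second fundamental form chosen so that e = a.

```python
import math

def principal_curvatures(E, F, G, e, f, g):
    """Roots of (EG - F^2) k^2 - (Eg - 2Ff + Ge) k + (eg - f^2) = 0."""
    A = E * G - F * F                     # determinant of the first fundamental form
    B = -(E * g - 2.0 * F * f + G * e)
    C = e * g - f * f
    disc = max(B * B - 4.0 * A * C, 0.0)  # clamp tiny negative round-off
    root = math.sqrt(disc)
    return (-B + root) / (2.0 * A), (-B - root) / (2.0 * A)

def gaussian_and_mean(E, F, G, e, f, g):
    """K and M as given by equations 4.53 and 4.54."""
    det_g = E * G - F * F
    K = (e * g - f * f) / det_g
    M = 0.5 * (E * g - 2.0 * F * f + G * e) / det_g
    return K, M

# Assumed sample data: a torus evaluated at the angle theta.
a, b, theta = 1.0, 3.0, 0.7
w = b + a * math.cos(theta)
E, F, G = a * a, 0.0, w * w
e, f, g = a, 0.0, w * math.cos(theta)

k1, k2 = principal_curvatures(E, F, G, e, f, g)
K, M = gaussian_and_mean(E, F, G, e, f, g)
print(k1, k2, K, M)
```

The product and the half-sum of the two roots reproduce K and M, which is just the statement that K and M are the symmetric functions of the roots of the quadratic.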


4.4.2 Covariant Derivative Formulation of Curvature 


4.4.1 Definition Let $X$ be a vector field on a surface $M$ in $\mathbf{R}^3$ and let $N$
be the normal vector. The map $L$, given by

$$LX = -\overline{\nabla}_X N, \tag{4.55}$$

is called the Weingarten map. Some authors call this the shape operator. The
same definition applies if $M$ is an n-dimensional hypersurface in $\mathbf{R}^{n+1}$.

Here, we have adopted the convention to overline the operator $\nabla$ when it
refers to the ambient space. The Weingarten map is natural to consider, since it
represents the rate of change of the normal in an arbitrary direction tangential
to the surface, which is what we wish to quantify.


4.4.2 Definition The Lie bracket $[X,Y]$ of two vector fields $X$ and $Y$ on a
surface $M$ is defined as the commutator,

$$[X,Y] = XY - YX, \tag{4.56}$$

meaning that if $f$ is a function on $M$, then $[X,Y](f) = X(Y(f)) - Y(X(f))$.


4.4.3 Proposition The Lie bracket of two vectors $X, Y \in \mathfrak{X}(M)$ is another
vector in $\mathfrak{X}(M)$.

Proof It suffices to prove that the bracket is a linear derivation on the space
of $C^\infty$ functions. Consider vectors $X, Y \in \mathfrak{X}(M)$ and smooth functions $f, g$ in
$M$. Then,

$$\begin{aligned}
[X,Y](f+g) &= X(Y(f+g)) - Y(X(f+g)), \\
&= X(Y(f) + Y(g)) - Y(X(f) + X(g)), \\
&= X(Y(f)) - Y(X(f)) + X(Y(g)) - Y(X(g)), \\
&= [X,Y](f) + [X,Y](g),
\end{aligned}$$

and

$$\begin{aligned}
[X,Y](fg) &= X(Y(fg)) - Y(X(fg)), \\
&= X[fY(g) + gY(f)] - Y[fX(g) + gX(f)], \\
&= X(f)Y(g) + fX(Y(g)) + X(g)Y(f) + gX(Y(f)) \\
&\quad - Y(f)X(g) - fY(X(g)) - Y(g)X(f) - gY(X(f)), \\
&= f[X(Y(g)) - Y(X(g))] + g[X(Y(f)) - Y(X(f))], \\
&= f[X,Y](g) + g[X,Y](f).
\end{aligned}$$
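A short worked example may help make the definition concrete. The vector fields below are chosen only for illustration and are not taken from the text: on the plane with coordinates $(x, y)$, take $X = \partial/\partial x$ and $Y = x\,\partial/\partial y$. Although $XY$ and $YX$ are second order operators, the second order terms cancel in the commutator and a first order operator remains.

```latex
\begin{aligned}
[X,Y](f) &= \frac{\partial}{\partial x}\left( x \frac{\partial f}{\partial y} \right)
            - x \frac{\partial}{\partial y}\left( \frac{\partial f}{\partial x} \right), \\
         &= \frac{\partial f}{\partial y} + x \frac{\partial^2 f}{\partial x \partial y}
            - x \frac{\partial^2 f}{\partial y \partial x}, \\
         &= \frac{\partial f}{\partial y},
\qquad \text{so that} \qquad [X,Y] = \frac{\partial}{\partial y}.
\end{aligned}
```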


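4.4.4 Proposition The Weingarten map is a linear transformation on $\mathfrak{X}(M)$.
Proof Linearity follows from the linearity of $\overline{\nabla}$, so it suffices to show that
$L : X \to LX$ maps $X \in \mathfrak{X}(M)$ to a vector $LX \in \mathfrak{X}(M)$. Since $N$ is the
unit normal to the surface, $\langle N, N \rangle = 1$, so any derivative of $\langle N, N \rangle$ is 0.
Assuming that the connection is compatible with the metric,

$$\begin{aligned}
\overline{\nabla}_X \langle N, N \rangle &= \langle \overline{\nabla}_X N, N \rangle + \langle N, \overline{\nabla}_X N \rangle, \\
&= 2 \langle \overline{\nabla}_X N, N \rangle, \\
&= 2 \langle -LX, N \rangle = 0.
\end{aligned}$$

Therefore, $LX$ is orthogonal to $N$; hence, it lies in $\mathfrak{X}(M)$.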

In the preceding section, we gave two equivalent definitions < dx,dx >, 
and < X,Y > of the first fundamental form. We will now do the same for the 
second fundamental form. 


4.4.5 Definition The second fundamental form is the bilinear map 


II(X,Y) =< LX,Y >. (4.57) 


4.4.6 Remark The two definitions of the second fundamental form are
consistent. This is easy to see if one chooses $X$ to have components $\mathbf{x}_\alpha$ and $Y$
to have components $\mathbf{x}_\beta$. With these choices, $LX$ has components $-\mathbf{n}_\alpha$ and
$II(X,Y)$ becomes $b_{\alpha\beta} = -\langle \mathbf{x}_\alpha, \mathbf{n}_\beta \rangle$.

We also note that there is a third fundamental form defined by

$$III(X,Y) = \langle LX, LY \rangle = \langle L^2 X, Y \rangle. \tag{4.58}$$


In classical notation, the third fundamental form would be denoted by < 
dn,dn >. As one would expect, the third fundamental form contains third 
order Taylor series information about the surface. 


4.4.7 Definition The torsion of a connection $\nabla$ is the operator $T$ such that
for all $X, Y$,

$$T(X,Y) = \nabla_X Y - \nabla_Y X - [X,Y]. \tag{4.59}$$

A connection is called torsion-free if $T(X,Y) = 0$. In this case,

$$\nabla_X Y - \nabla_Y X = [X,Y].$$


We will elaborate later on the importance of torsion-free connections. For the 
time being, it suffices to assume that for the rest of this section, all connections 
are torsion-free. Using this assumption, it is possible to prove the following 
important theorem. 


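4.4.8 Theorem The Weingarten map is a self-adjoint endomorphism on $\mathfrak{X}(M)$.
Proof We have already shown that $L : \mathfrak{X}(M) \to \mathfrak{X}(M)$ is a linear map.
Recall that an operator $L$ on a linear space is self-adjoint if $\langle LX, Y \rangle = \langle X, LY \rangle$,
so the theorem is equivalent to proving that the second fundamental
form is symmetric, $II(X,Y) = II(Y,X)$. Computing the difference of these
two quantities, we get

$$\begin{aligned}
II(X,Y) - II(Y,X) &= \langle LX, Y \rangle - \langle LY, X \rangle, \\
&= \langle -\overline{\nabla}_X N, Y \rangle - \langle -\overline{\nabla}_Y N, X \rangle.
\end{aligned}$$

Since $\langle X, N \rangle = \langle Y, N \rangle = 0$ and the connection is compatible with the
metric, we know that

$$\begin{aligned}
\langle -\overline{\nabla}_X N, Y \rangle &= \langle N, \overline{\nabla}_X Y \rangle, \\
\langle -\overline{\nabla}_Y N, X \rangle &= \langle N, \overline{\nabla}_Y X \rangle,
\end{aligned}$$

hence,

$$\begin{aligned}
II(X,Y) - II(Y,X) &= \langle N, \overline{\nabla}_X Y \rangle - \langle N, \overline{\nabla}_Y X \rangle, \\
&= \langle N, \overline{\nabla}_X Y - \overline{\nabla}_Y X \rangle, \\
&= \langle N, [X,Y] \rangle, \\
&= 0 \quad (\text{iff } [X,Y] \in T(M)).
\end{aligned}$$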


The central theorem of linear algebra is the spectral theorem. In the case of 
real, self-adjoint operators, the spectral theorem states that given the eigenvalue 
equation for a symmetric operator 


$$LX = \kappa X, \tag{4.60}$$

on a vector space with a real inner product, the eigenvalues are always real and
eigenvectors corresponding to different eigenvalues are orthogonal. Here, the
vector spaces in question are the tangent spaces at each point of a surface in $\mathbf{R}^3$,
so the dimension is 2. Hence, we expect two eigenvalues and two eigenvectors:

$$LX_1 = \kappa_1 X_1, \tag{4.61}$$
$$LX_2 = \kappa_2 X_2. \tag{4.62}$$

4.4.9 Definition The eigenvalues $\kappa_1$ and $\kappa_2$ of the Weingarten map $L$ are
called the principal curvatures and the eigenvectors $X_1$ and $X_2$ are called the
principal directions.




Several possible situations may occur, depending on the classification of the 
eigenvalues at each point p on a given surface: 


1. If $\kappa_1 \neq \kappa_2$ and both eigenvalues are positive, then $p$ is called an elliptic
point.

2. If $\kappa_1 \kappa_2 < 0$, then $p$ is called a hyperbolic point.

3. If $\kappa_1 = \kappa_2 \neq 0$, then $p$ is called an umbilic point.

4. If $\kappa_1 \kappa_2 = 0$, then $p$ is called a parabolic point.


It is also known from linear algebra that in a vector space of dimension two,
the determinant and the trace of a self-adjoint operator are the only invariants
under an adjoint (similarity) transformation. Clearly, these invariants are
important in the case of the operator $L$, and they deserve special names. In
the case of a hypersurface of n dimensions, there would be n eigenvalues, counting
multiplicities, so the classification of the points would be more elaborate.

4.4.10 Definition The determinant $K = \det(L)$ is called the Gaussian
curvature of $M$ and $H = \tfrac{1}{2}\mathrm{Tr}(L)$ is called the mean curvature.

Since any self-adjoint operator is diagonalizable and in a diagonal basis the
matrix representing $L$ is $\mathrm{diag}(\kappa_1, \kappa_2)$, it follows immediately that

$$\begin{aligned}
K &= \kappa_1 \kappa_2, \\
H &= \tfrac{1}{2}(\kappa_1 + \kappa_2).
\end{aligned} \tag{4.63}$$


An alternative definition of curvature is obtained by considering the unit
normal as a map $N : M \to S^2$, which maps each point $p$ on the surface $M$ to
the point on the sphere corresponding to the position vector $N_p$. The map is
called the Gauss map.

4.4.11 Examples

1. The Gauss map of a plane is constant. The image is a single point on $S^2$.

2. The image of the Gauss map of a circular cylinder is a great circle on $S^2$.

3. The Gauss map of the top half of a circular cone sends all points on the
cone into a circle. We may envision this circle as the intersection of the
cone and a unit sphere centered at the vertex.

4. The Gauss map of a circular hyperboloid of one sheet misses two
antipodal spherical caps with boundaries corresponding to the circles of the
asymptotic cone.

5. The Gauss map of a catenoid misses two antipodal points.

The Weingarten map is minus the derivative $N_* = dN$ of the Gauss map. That
is, $LX = -N_*(X)$.


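4.4.12 Proposition Let $X$ and $Y$ be any linearly independent vectors in
$\mathfrak{X}(M)$. Then

$$\begin{aligned}
LX \times LY &= K (X \times Y), \\
(LX \times Y) + (X \times LY) &= 2H (X \times Y).
\end{aligned} \tag{4.64}$$

Proof Since $LX, LY \in \mathfrak{X}(M)$, they can be expressed as linear combinations
of the basis vectors $X$ and $Y$.

$$\begin{aligned}
LX &= a_1 X + b_1 Y, \\
LY &= a_2 X + b_2 Y.
\end{aligned}$$

Computing the cross product, we get

$$\begin{aligned}
LX \times LY &= \begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix} X \times Y, \\
&= \det(L)(X \times Y), \\
&= K (X \times Y).
\end{aligned}$$

Similarly

$$\begin{aligned}
(LX \times Y) + (X \times LY) &= (a_1 + b_2)(X \times Y), \\
&= \mathrm{Tr}(L)(X \times Y), \\
&= (2H)(X \times Y).
\end{aligned}$$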
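4.4.13 Proposition

$$K = \frac{eg - f^2}{EG - F^2}, \tag{4.65}$$

$$H = \frac{1}{2}\,\frac{Eg - 2Ff + eG}{EG - F^2}. \tag{4.66}$$

Proof Starting with equations (4.64), take the inner product of both sides
with $X \times Y$ and use the vector identity (4.44). We immediately get

$$K = \frac{\begin{vmatrix} \langle LX, X \rangle & \langle LX, Y \rangle \\ \langle LY, X \rangle & \langle LY, Y \rangle \end{vmatrix}}{\begin{vmatrix} \langle X, X \rangle & \langle X, Y \rangle \\ \langle Y, X \rangle & \langle Y, Y \rangle \end{vmatrix}},$$

$$2H = \frac{\begin{vmatrix} \langle LX, X \rangle & \langle LX, Y \rangle \\ \langle Y, X \rangle & \langle Y, Y \rangle \end{vmatrix} + \begin{vmatrix} \langle X, X \rangle & \langle X, Y \rangle \\ \langle LY, X \rangle & \langle LY, Y \rangle \end{vmatrix}}{\begin{vmatrix} \langle X, X \rangle & \langle X, Y \rangle \\ \langle Y, X \rangle & \langle Y, Y \rangle \end{vmatrix}}. \tag{4.67}$$

The result follows by taking $X = \mathbf{x}_u$ and $Y = \mathbf{x}_v$. Not surprisingly, this is
in complete agreement with the classical formulas for the Gaussian curvature
(equation 4.53) and for the mean curvature (equation 4.54).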




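If we denote by $g$ and $b$ the matrices of the fundamental forms whose
components are $g_{\alpha\beta}$ and $b_{\alpha\beta}$ respectively, we can write the equations for the
curvatures as:

$$K = \det\left(\frac{b}{g}\right) = \det(g^{-1} b), \tag{4.68}$$

$$2H = \mathrm{Tr}\left(\frac{b}{g}\right) = \mathrm{Tr}(g^{-1} b). \tag{4.69}$$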


4.4.14 Example Sphere 

From equations 4.21 and 4.3 we see that $K = 1/a^2$ and $H = 1/a$. This is totally
intuitive since one would expect $\kappa_1 = \kappa_2 = 1/a$ because the normal curvature
in any direction should equal the curvature of a great circle. This means that
a sphere is a surface of constant curvature and every point of a sphere is an
umbilic point. This is another way to think of the symmetry of the sphere in
the sense that an observer at any point sees the same normal curvature in all
directions.


The next theorem due to Euler gives a characterization of the normal curvature 
in the direction of an arbitrary unit vector X tangent to the surface M at a 
given point. 


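4.4.15 Theorem (Euler) Let $X_1$ and $X_2$ be unit eigenvectors of $L$ and let
$X = (\cos\theta) X_1 + (\sin\theta) X_2$. Then

$$II(X, X) = \kappa_1 \cos^2\theta + \kappa_2 \sin^2\theta. \tag{4.70}$$

Proof Simply compute $II(X,X) = \langle LX, X \rangle$, using the fact that $LX_1 = \kappa_1 X_1$,
$LX_2 = \kappa_2 X_2$, and noting that the eigenvectors are orthogonal. We get

$$\begin{aligned}
\langle LX, X \rangle &= \langle (\cos\theta)\kappa_1 X_1 + (\sin\theta)\kappa_2 X_2, \; (\cos\theta) X_1 + (\sin\theta) X_2 \rangle, \\
&= \kappa_1 \cos^2\theta \, \langle X_1, X_1 \rangle + \kappa_2 \sin^2\theta \, \langle X_2, X_2 \rangle, \\
&= \kappa_1 \cos^2\theta + \kappa_2 \sin^2\theta.
\end{aligned}$$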


4.4.16 Theorem The first, second and third fundamental forms satisfy the 
equation 


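$$III - 2H\,II + K\,I = 0. \tag{4.71}$$

Proof The proof follows immediately from the fact that for a symmetric 2
by 2 matrix $A$, the characteristic equation is $x^2 - \mathrm{tr}(A)\,x + \det(A) = 0$, and
from the Cayley-Hamilton theorem stating that the matrix is annihilated by its
characteristic polynomial. Applied to the Weingarten map, this gives
$L^2 - 2H L + K\,\mathrm{Id} = 0$, and taking the inner product $\langle (L^2 - 2HL + K)X, Y \rangle$
yields the stated identity.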
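The identity 4.71 can be checked numerically in matrix form. In components, I, II and III are represented by the matrices $g$, $b$ and $b\,g^{-1}b$ (the last because $\langle LX, LY \rangle$ with $L = g^{-1}b$), so the claim becomes $b\,g^{-1}b - 2H\,b + K\,g = 0$. The sketch below is only an illustration under that assumption; the helper names and the sample matrices are made up and do not come from a particular surface.

```python
def matmul2(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse2(A):
    """Inverse of a 2x2 matrix with non-zero determinant."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

def fundamental_form_identity(g, b):
    """Return K, H and the matrix of III - 2H II + K I."""
    g_inv = inverse2(g)
    L = matmul2(g_inv, b)                        # shape operator matrix g^{-1} b
    K = L[0][0] * L[1][1] - L[0][1] * L[1][0]    # det(L), equation 4.68
    H = 0.5 * (L[0][0] + L[1][1])                # Tr(L)/2, equation 4.69
    third = matmul2(b, matmul2(g_inv, b))        # matrix of the third fundamental form
    residual = [[third[i][j] - 2.0 * H * b[i][j] + K * g[i][j]
                 for j in range(2)] for i in range(2)]
    return K, H, residual

# Assumed sample coefficients: g = [[E, F], [F, G]], b = [[e, f], [f, g]].
g = [[2.0, 0.5], [0.5, 3.0]]
b = [[1.0, 0.2], [0.2, -0.7]]
K, H, residual = fundamental_form_identity(g, b)
print(K, H, residual)
```

The residual vanishes up to round-off for any symmetric $b$ and any invertible symmetric $g$, which reflects that the identity is purely a consequence of Cayley-Hamilton.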




Fig. 4.11: Surface Frame. 


4.5 Fundamental Equations 


4.5.1 Gauss-Weingarten Equations 


As we have just seen, for example, the Gaussian curvature of a sphere of radius
$a$ is $1/a^2$. To compute this curvature we had to first compute the coefficients
of the second fundamental form, and therefore, we first needed to compute
the normal to the surface in $\mathbf{R}^3$. The computation therefore depended on the
particular coordinate chart parametrizing the surface.

However, it would be reasonable to conclude that the curvature of the sphere
is an intrinsic quantity, independent of the embedding in $\mathbf{R}^3$. After all, a
“two-dimensional” creature such as an ant moving on the surface of the sphere
would be constrained by the curvature of the sphere independent of the higher
dimension on which the surface lives. This mode of thinking led the brilliant
mathematicians Gauss and Riemann to question if the coefficients of the second
fundamental form were functionally computable from the coefficients of the first
fundamental form. To explore this idea, consider again the basis vectors at each
point of a surface consisting of two tangent vectors and the normal, as shown in
figure 4.11. Given a coordinate chart $\mathbf{x}(u^\alpha)$, the vectors $\mathbf{x}_\alpha$ live on the tangent
space, but this is not necessarily true for the second derivative vectors $\mathbf{x}_{\alpha\beta}$.
Here, $\mathbf{x}(u^\alpha)$ could refer to a coordinate patch in any number of dimensions, so
all the tensor index formulas that follow apply to surfaces of codimension 1
in $\mathbf{R}^n$. Since the set of vectors $\{\mathbf{x}_\alpha, \mathbf{n}\}$ constitutes a basis for $\mathbf{R}^n$ at each point on
the surface, we can express the vectors $\mathbf{x}_{\alpha\beta}$ as linear combinations of the basis
vectors. Therefore, there exist coefficients $\Gamma^\gamma_{\alpha\beta}$ and $c_{\alpha\beta}$ such that

$$\mathbf{x}_{\alpha\beta} = \Gamma^\gamma_{\alpha\beta}\, \mathbf{x}_\gamma + c_{\alpha\beta}\, \mathbf{n}. \tag{4.72}$$

Taking the inner product of equation 4.72 with $\mathbf{n}$, noticing that the latter is a
unit vector orthogonal to $\mathbf{x}_\gamma$, we find that $c_{\alpha\beta} = \langle \mathbf{x}_{\alpha\beta}, \mathbf{n} \rangle$, and hence these are
just the coefficients of the second fundamental form. In other words, equation
4.72 can be written as

$$\mathbf{x}_{\alpha\beta} = \Gamma^\gamma_{\alpha\beta}\, \mathbf{x}_\gamma + b_{\alpha\beta}\, \mathbf{n}. \tag{4.73}$$


Equation 4.73 together with equation 4.76 below, are called the formulae of
Gauss. The covariant derivative formulation of the equation of Gauss follows
in a similar fashion. Let $X$ and $Y$ be vector fields tangent to the surface. We
decompose the covariant derivative of $Y$ in the direction of $X$ into its tangential
and normal components

$$\overline{\nabla}_X Y = \nabla_X Y + h(X,Y) N.$$

But then,

$$\begin{aligned}
h(X,Y) &= \langle \overline{\nabla}_X Y, N \rangle, \\
&= -\langle Y, \overline{\nabla}_X N \rangle, \\
&= -\langle Y, -LX \rangle, \\
&= \langle LX, Y \rangle, \\
&= II(X,Y).
\end{aligned}$$

Thus, the coordinate independent formulation of the equation of Gauss reads

$$\overline{\nabla}_X Y = \nabla_X Y + II(X,Y) N. \tag{4.74}$$


The quantity $\nabla_X Y$ represents a covariant derivative on the surface, so in that
sense, it is intrinsic to the surface. If $\alpha(s)$ is a curve on the surface with tangent
$T = \alpha'(s)$, we say that a vector field $Y$ is parallel-transported along the curve
if $\nabla_T Y = 0$. This notion of parallelism refers to parallelism on the surface, not
the ambient space. To illustrate by example, Figure 4.12 shows a vector field
$Y$ tangent to a sphere along the circle with azimuthal angle $\theta = \pi/3$. The
circle has unit tangent $T = \alpha'(s)$, and at each point on the circle, the vector $Y$
points North. To the inhabitants of the sphere, the vector $Y$ appears
parallel-transported on the surface along the curve, that is $\nabla_T Y = 0$. However, $Y$ is
clearly not parallel-transported in the ambient $\mathbf{R}^3$ space with respect to the
connection $\overline{\nabla}$.


The torsion $T$ of the connection $\nabla$ is defined exactly as before (see equation 4.59):

$$T(X,Y) = \nabla_X Y - \nabla_Y X - [X,Y].$$

Also, as in definition 3.14, the connection is compatible with the metric on the
surface if

$$\nabla_X \langle Y, Z \rangle = \langle \nabla_X Y, Z \rangle + \langle Y, \nabla_X Z \rangle.$$

A torsion-free connection that is compatible with the metric is called a
Levi-Civita connection.

Fig. 4.12


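4.5.1 Proposition A Levi-Civita connection preserves length and angles
under parallel transport.
Proof Let $T = \alpha'(t)$ be tangent to the curve $\alpha(t)$, and let $X$ and $Y$ be
parallel-transported along $\alpha$. By definition, $\nabla_T X = \nabla_T Y = 0$. Then

$$\begin{aligned}
\nabla_T \langle X, X \rangle &= \langle \nabla_T X, X \rangle + \langle X, \nabla_T X \rangle, \\
&= 2 \langle \nabla_T X, X \rangle = 0, \\
&\Rightarrow \|X\| = \text{constant}. \\
\nabla_T \langle X, Y \rangle &= \langle \nabla_T X, Y \rangle + \langle X, \nabla_T Y \rangle = 0, \\
&\Rightarrow \langle X, Y \rangle = \text{constant}.
\end{aligned}$$

So,

$$\cos\theta = \frac{\langle X, Y \rangle}{\|X\| \, \|Y\|} = \text{constant}.$$

If one takes $\{e_\alpha\}$ to be a basis of the tangent space, the components of the
connection in that basis are given by the familiar equation

$$\nabla_{e_\alpha} e_\beta = \Gamma^\gamma_{\alpha\beta}\, e_\gamma.$$

The $\Gamma$'s here are of course the same Christoffel symbols in the equation of Gauss
4.73. We have the following important result: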


4.5.2 Theorem In a manifold $\{M, g\}$ with metric $g$, there exists a unique
Levi-Civita connection.


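The proof is implicit in the computations that follow leading to equation
4.76, which expresses the components uniquely in terms of the metric. The entire
equation (4.73) must be symmetric on the indices $\alpha\beta$, since $\mathbf{x}_{\alpha\beta} = \mathbf{x}_{\beta\alpha}$, so
$\Gamma^\gamma_{\alpha\beta} = \Gamma^\gamma_{\beta\alpha}$ is also symmetric on the lower indices. These quantities are called
the Christoffel symbols of the first kind. Now we take the inner product with
$\mathbf{x}_\sigma$ to deduce that

$$\begin{aligned}
\langle \mathbf{x}_{\alpha\beta}, \mathbf{x}_\sigma \rangle &= \Gamma^\gamma_{\alpha\beta} \langle \mathbf{x}_\gamma, \mathbf{x}_\sigma \rangle, \\
&= \Gamma^\gamma_{\alpha\beta}\, g_{\gamma\sigma}, \\
&= \Gamma_{\alpha\beta\sigma},
\end{aligned}$$

where we have lowered the third index with the metric on the right hand side
of the last equation. The quantities $\Gamma_{\alpha\beta\sigma}$ are called Christoffel symbols of the
second kind. Here we must note that not all indices are created equal. The
Christoffel symbols of the second kind are only symmetric on the first two
indices. The notation $\Gamma_{\alpha\beta\sigma} = [\alpha\beta, \sigma]$ is also used in the literature.

The Christoffel symbols can be expressed in terms of the metric by first
noticing that the derivative of the first fundamental form is given by (see
equation 3.34)

$$\begin{aligned}
g_{\alpha\gamma,\beta} &= \frac{\partial}{\partial u^\beta} \langle \mathbf{x}_\alpha, \mathbf{x}_\gamma \rangle, \\
&= \langle \mathbf{x}_{\alpha\beta}, \mathbf{x}_\gamma \rangle + \langle \mathbf{x}_\alpha, \mathbf{x}_{\gamma\beta} \rangle, \\
&= \Gamma_{\alpha\beta\gamma} + \Gamma_{\gamma\beta\alpha}.
\end{aligned}$$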




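Taking other cyclic permutations of this equation, we get

$$\begin{aligned}
g_{\alpha\gamma,\beta} &= \Gamma_{\alpha\beta\gamma} + \Gamma_{\gamma\beta\alpha}, \\
g_{\beta\gamma,\alpha} &= \Gamma_{\alpha\beta\gamma} + \Gamma_{\gamma\alpha\beta}, \\
g_{\alpha\beta,\gamma} &= \Gamma_{\alpha\gamma\beta} + \Gamma_{\gamma\beta\alpha}.
\end{aligned}$$

Adding the first two and subtracting the third of the equations above, and
recalling that the $\Gamma$'s are symmetric on the first two indices, we obtain the
formula

$$\Gamma_{\alpha\beta\gamma} = \frac{1}{2}\left( g_{\alpha\gamma,\beta} + g_{\beta\gamma,\alpha} - g_{\alpha\beta,\gamma} \right). \tag{4.75}$$

Raising the third index with the inverse of the metric, we also have the
following formula for the Christoffel symbols of the first kind (hereafter, Christoffel
symbols refer to the symbols of the first kind, unless otherwise specified.)

$$\Gamma^\sigma_{\alpha\beta} = \frac{1}{2}\, g^{\sigma\gamma} \left( g_{\alpha\gamma,\beta} + g_{\beta\gamma,\alpha} - g_{\alpha\beta,\gamma} \right). \tag{4.76}$$

The Christoffel symbols are clearly symmetric in the lower indices.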


Unless otherwise specified, a connection on {M, g} refers to the unique Levi- 
Civita connection. 

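We derive a well-known formula for the Christoffel symbols for the case
$\Gamma^\alpha_{\alpha\beta}$. From 4.76 we have:

$$\Gamma^\alpha_{\alpha\beta} = \frac{1}{2}\, g^{\alpha\gamma} \left( g_{\alpha\gamma,\beta} + g_{\beta\gamma,\alpha} - g_{\alpha\beta,\gamma} \right).$$

On the other hand,

$$g^{\alpha\gamma} g_{\beta\gamma,\alpha} = g^{\alpha\gamma} g_{\alpha\beta,\gamma},$$

as can be seen by switching the repeated indices of summation $\alpha$ and $\gamma$, and
using the symmetry of the metric. The equation reduces to

$$\Gamma^\alpha_{\alpha\beta} = \frac{1}{2}\, g^{\alpha\gamma} g_{\alpha\gamma,\beta}.$$

Let $A$ be the cofactor transposed matrix of $g$. From the linear algebra formula
for the expansion of a determinant in terms of cofactors we can get an
expression for the inverse of the metric as follows:

$$\begin{aligned}
\det(g) &= g_{\alpha\gamma} A^{\alpha\gamma}, \\
\frac{\partial \det(g)}{\partial g_{\alpha\gamma}} &= A^{\alpha\gamma}, \\
g^{\alpha\gamma} &= \frac{A^{\alpha\gamma}}{\det(g)} = \frac{1}{\det(g)} \frac{\partial \det(g)}{\partial g_{\alpha\gamma}},
\end{aligned}$$

so that

$$\begin{aligned}
\Gamma^\alpha_{\alpha\beta} &= \frac{1}{2\det(g)} \frac{\partial \det(g)}{\partial g_{\alpha\gamma}} \frac{\partial g_{\alpha\gamma}}{\partial u^\beta}, \\
&= \frac{1}{2\det(g)} \frac{\partial}{\partial u^\beta} \left( \det(g) \right).
\end{aligned} \tag{4.78}$$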


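Using this result we can also get a tensorial version of the divergence of the
vector field $X = v^\alpha e_\alpha$ on the manifold. Using the classical covariant derivative
formula 3.25 for the components $v^\alpha$, we define:

$$\mathrm{Div}\, X = \nabla \cdot X = v^\alpha{}_{\|\alpha}. \tag{4.79}$$

We get

$$\begin{aligned}
\mathrm{Div}\, X &= v^\alpha{}_{,\alpha} + \Gamma^\alpha_{\alpha\gamma}\, v^\gamma, \\
&= \frac{\partial v^\alpha}{\partial u^\alpha} + \frac{1}{2\det(g)} \frac{\partial \det(g)}{\partial u^\gamma}\, v^\gamma, \\
&= \frac{1}{\sqrt{\det(g)}} \frac{\partial}{\partial u^\alpha} \left( \sqrt{\det(g)}\, v^\alpha \right).
\end{aligned} \tag{4.80}$$

If $f$ is a function on the manifold, $df = f_{,\beta}\, du^\beta$, so the contravariant components
of the gradient are

$$(\nabla f)^\alpha = g^{\alpha\beta} f_{,\beta}. \tag{4.81}$$

Combining with the equation above, we get a second order operator

$$\begin{aligned}
\Delta f &= \mathrm{Div}(\nabla f), \\
&= \frac{1}{\sqrt{\det(g)}} \frac{\partial}{\partial u^\alpha} \left( \sqrt{\det(g)}\, g^{\alpha\beta} f_{,\beta} \right).
\end{aligned} \tag{4.82}$$

The quantity $\Delta$ is called the Laplace-Beltrami operator on a function and it
generalizes the Laplacian of functions in $\mathbf{R}^n$ to functions on manifolds.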


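4.5.3 Example Laplacian in Spherical Coordinates
The metric in spherical coordinates is $ds^2 = dr^2 + r^2\, d\theta^2 + r^2 \sin^2\theta\, d\phi^2$, so

$$g_{\alpha\beta} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & r^2 \sin^2\theta \end{bmatrix}, \quad
g^{\alpha\beta} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{1}{r^2} & 0 \\ 0 & 0 & \frac{1}{r^2 \sin^2\theta} \end{bmatrix}, \quad
\sqrt{\det(g)} = r^2 \sin\theta.$$

The Laplace-Beltrami formula gives,

$$\begin{aligned}
\Delta f &= \frac{1}{\sqrt{\det(g)}} \left[ \frac{\partial}{\partial r}\left( \sqrt{\det(g)}\, g^{11} \frac{\partial f}{\partial r} \right) + \frac{\partial}{\partial \theta}\left( \sqrt{\det(g)}\, g^{22} \frac{\partial f}{\partial \theta} \right) + \frac{\partial}{\partial \phi}\left( \sqrt{\det(g)}\, g^{33} \frac{\partial f}{\partial \phi} \right) \right], \\
&= \frac{1}{r^2 \sin\theta} \left[ \frac{\partial}{\partial r}\left( r^2 \sin\theta \frac{\partial f}{\partial r} \right) + \frac{\partial}{\partial \theta}\left( r^2 \sin\theta\, \frac{1}{r^2} \frac{\partial f}{\partial \theta} \right) + \frac{\partial}{\partial \phi}\left( r^2 \sin\theta\, \frac{1}{r^2 \sin^2\theta} \frac{\partial f}{\partial \phi} \right) \right], \\
&= \frac{1}{r^2} \frac{\partial}{\partial r}\left( r^2 \frac{\partial f}{\partial r} \right) + \frac{1}{r^2 \sin\theta} \frac{\partial}{\partial \theta}\left( \sin\theta \frac{\partial f}{\partial \theta} \right) + \frac{1}{r^2 \sin^2\theta} \frac{\partial^2 f}{\partial \phi^2}.
\end{aligned}$$

The result is the same as the formula for the Laplacian 3.9 found by differential
form methods.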
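The spherical Laplacian can be sanity-checked numerically. The sketch below is an illustration only: it approximates the three terms of the Laplace-Beltrami formula with central finite differences and applies them to two test functions whose Cartesian Laplacians are known, $r^2 = x^2 + y^2 + z^2$ (Laplacian 6) and the coordinate function $x = r \sin\theta \cos\phi$ (harmonic). The function name, step size and sample point are assumptions.

```python
import math

def laplacian_spherical(f, r, theta, phi, h=1e-3):
    """Finite-difference version of the spherical Laplace-Beltrami formula."""
    def f_r(rr):        # df/dr at radius rr
        return (f(rr + h, theta, phi) - f(rr - h, theta, phi)) / (2 * h)

    def f_theta(tt):    # df/dtheta at polar angle tt
        return (f(r, tt + h, phi) - f(r, tt - h, phi)) / (2 * h)

    # (1/r^2) d/dr ( r^2 df/dr )
    radial = ((r + h) ** 2 * f_r(r + h) - (r - h) ** 2 * f_r(r - h)) / (2 * h)
    radial /= r ** 2
    # (1/(r^2 sin(theta))) d/dtheta ( sin(theta) df/dtheta )
    polar = (math.sin(theta + h) * f_theta(theta + h)
             - math.sin(theta - h) * f_theta(theta - h)) / (2 * h)
    polar /= r ** 2 * math.sin(theta)
    # (1/(r^2 sin^2(theta))) d^2 f / dphi^2
    azimuthal = (f(r, theta, phi + h) - 2 * f(r, theta, phi)
                 + f(r, theta, phi - h)) / h ** 2
    azimuthal /= r ** 2 * math.sin(theta) ** 2
    return radial + polar + azimuthal

def r_squared(r, theta, phi):
    return r * r

def x_coordinate(r, theta, phi):
    return r * math.sin(theta) * math.cos(phi)

lap_r2 = laplacian_spherical(r_squared, 1.5, 0.8, 0.3)
lap_x = laplacian_spherical(x_coordinate, 1.5, 0.8, 0.3)
print(lap_r2, lap_x)
```

Both values agree with the Cartesian results up to the discretization error of the finite differences.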


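4.5.4 Example
As an example we unpack the formula for $\Gamma^1_{11}$. First, note that $\det(g) =
\|g_{\alpha\beta}\| = EG - F^2$. From equation 4.76 we have

$$\begin{aligned}
\Gamma^1_{11} &= \frac{1}{2}\, g^{1\gamma} \left( g_{1\gamma,1} + g_{1\gamma,1} - g_{11,\gamma} \right), \\
&= \frac{1}{2}\, g^{1\gamma} \left( 2 g_{1\gamma,1} - g_{11,\gamma} \right), \\
&= \frac{1}{2} \left[ g^{11} (2 g_{11,1} - g_{11,1}) + g^{12} (2 g_{12,1} - g_{11,2}) \right], \\
&= \frac{1}{2\det(g)} \left[ G E_u - F (2 F_u - E_v) \right], \\
&= \frac{G E_u - 2 F F_u + F E_v}{2 (EG - F^2)}.
\end{aligned}$$

Due to symmetry, there are five other similar equations for the other $\Gamma$'s.
Proceeding as above, we can derive the entire set.

$$\begin{aligned}
\Gamma^1_{11} &= \frac{G E_u - 2 F F_u + F E_v}{2 (EG - F^2)}, &
\Gamma^2_{11} &= \frac{2 E F_u - E E_v - F E_u}{2 (EG - F^2)}, \\
\Gamma^1_{12} &= \frac{G E_v - F G_u}{2 (EG - F^2)}, &
\Gamma^2_{12} &= \frac{E G_u - F E_v}{2 (EG - F^2)}, \\
\Gamma^1_{22} &= \frac{2 G F_v - G G_u - F G_v}{2 (EG - F^2)}, &
\Gamma^2_{22} &= \frac{E G_v - 2 F F_v + F G_u}{2 (EG - F^2)}.
\end{aligned} \tag{4.83}$$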


They are a bit messy, but they all simplify considerably for orthogonal systems, 
in which case F = 0. Another reason why we like those coordinate systems. 
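The six formulas in 4.83 translate directly into code. The following sketch is a plain transcription under the assumption that the caller supplies the metric coefficients and their first partial derivatives at a point; the function name and the dictionary keys (upper index, then the two lower indices) are conventions invented for this illustration. The sample values are those of a sphere of radius a with $u = \theta$, $v = \phi$, so that $E = a^2$, $F = 0$, $G = a^2 \sin^2\theta$.

```python
import math

def christoffel_symbols(E, F, G, Eu, Ev, Fu, Fv, Gu, Gv):
    """Christoffel symbols of a surface metric, keyed as (upper, lower, lower)."""
    d = 2.0 * (E * G - F * F)
    return {
        (1, 1, 1): (G * Eu - 2.0 * F * Fu + F * Ev) / d,
        (2, 1, 1): (2.0 * E * Fu - E * Ev - F * Eu) / d,
        (1, 1, 2): (G * Ev - F * Gu) / d,
        (2, 1, 2): (E * Gu - F * Ev) / d,
        (1, 2, 2): (2.0 * G * Fv - G * Gu - F * Gv) / d,
        (2, 2, 2): (E * Gv - 2.0 * F * Fv + F * Gu) / d,
    }

# Assumed sample: sphere of radius a at polar angle theta.
a, theta = 2.0, 0.9
E, F, G = a * a, 0.0, (a * math.sin(theta)) ** 2
Gu = 2.0 * a * a * math.sin(theta) * math.cos(theta)   # dG/dtheta
gamma = christoffel_symbols(E, F, G, 0.0, 0.0, 0.0, 0.0, Gu, 0.0)
print(gamma)
```

For the sphere only two independent symbols survive, $\Gamma^1_{22} = -\sin\theta\cos\theta$ and $\Gamma^2_{12} = \cot\theta$, which illustrates how much the formulas simplify when $F = 0$.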


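4.5.5 Example Harmonic functions.
A function $h$ on a surface in $\mathbf{R}^3$ is called harmonic if it satisfies:

$$\Delta h = 0. \tag{4.84}$$

Noticing that the matrix components of the inverse of the metric are given by

$$g^{\alpha\beta} = \frac{1}{EG - F^2} \begin{bmatrix} G & -F \\ -F & E \end{bmatrix}, \tag{4.85}$$

we get immediately from 4.82 the classical Laplace-Beltrami equation for
surfaces,

$$\Delta h = \frac{1}{\sqrt{EG - F^2}} \left\{ \frac{\partial}{\partial u} \left[ \frac{G h_u - F h_v}{\sqrt{EG - F^2}} \right] + \frac{\partial}{\partial v} \left[ \frac{E h_v - F h_u}{\sqrt{EG - F^2}} \right] \right\} = 0. \tag{4.86}$$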




If the coordinate patch is orthogonal so that $F = 0$, the equation reduces to:

$$\frac{\partial}{\partial u} \left( \frac{\sqrt{G}}{\sqrt{E}} \frac{\partial h}{\partial u} \right) + \frac{\partial}{\partial v} \left( \frac{\sqrt{E}}{\sqrt{G}} \frac{\partial h}{\partial v} \right) = 0. \tag{4.87}$$

If in addition $E = G = \lambda^2$ so that the metric has the form,

$$ds^2 = \lambda^2 (du^2 + dv^2), \tag{4.88}$$

then,

$$\Delta h = \frac{1}{\lambda^2} \left[ \frac{\partial^2 h}{\partial u^2} + \frac{\partial^2 h}{\partial v^2} \right].$$

Hence, $\Delta h = 0$ is equivalent to $\nabla^2 h = 0$, where $\nabla^2$ is the Euclidean
Laplacian. (Please compare to the discussion on the isothermal coordinates example
4.5.14.) Two metrics that differ by a multiplicative factor are called conformally
related. The result here means that the Laplace equation $\Delta h = 0$ is invariant
under this conformal transformation. This property is essential in applying the
elegant methods of complex variables and conformal mappings to solve physical
problems involving the Laplacian in the plane.

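For a surface $z = f(x,y)$, which we can write as a Monge patch
$\mathbf{x} = \langle x, y, f(x,y) \rangle$, we have $E = 1 + f_x^2$, $F = f_x f_y$ and $G = 1 + f_y^2$. A short
computation shows that in this case, the Laplace-Beltrami equation can be
written as, (compare to equation 5.43)

$$\frac{\partial}{\partial x} \left[ \frac{f_x}{\sqrt{1 + f_x^2 + f_y^2}} \right] + \frac{\partial}{\partial y} \left[ \frac{f_y}{\sqrt{1 + f_x^2 + f_y^2}} \right] = 0,$$

or in terms of the Euclidean $\mathbf{R}^2$ del operator $\nabla = \langle \frac{\partial}{\partial x}, \frac{\partial}{\partial y} \rangle$,

$$\nabla \cdot \left[ \frac{\nabla f}{\sqrt{1 + \|\nabla f\|^2}} \right] = 0. \tag{4.90}$$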


4.5.2 Curvature Tensor, Gauss’s Theorema Egregium 


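A fascinating set of relations can be obtained simply by equating $\mathbf{x}_{\beta\gamma\delta} =
\mathbf{x}_{\beta\delta\gamma}$. First notice that we can also write $\mathbf{n}_\alpha$ in terms of the frame vectors. This
is by far easier since $\langle \mathbf{n}, \mathbf{n} \rangle = 1$ implies that $\langle \mathbf{n}_\alpha, \mathbf{n} \rangle = 0$, so $\mathbf{n}_\alpha$ lies on the
tangent plane and it is therefore a linear combination of the tangent vectors. As
before, we easily verify that the coefficients are the second fundamental form
with a raised index

$$\mathbf{n}_\alpha = -b^\gamma_\alpha\, \mathbf{x}_\gamma. \tag{4.91}$$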




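These are called the formulae of Weingarten.

Differentiating the equation of Gauss and recursively using the formulas of
Gauss 4.73 and Weingarten 4.91 to write all the components in terms of the
frame, we get

$$\begin{aligned}
\mathbf{x}_{\beta\delta} &= \Gamma^\alpha_{\beta\delta}\, \mathbf{x}_\alpha + b_{\beta\delta}\, \mathbf{n}, \\
\mathbf{x}_{\beta\delta\gamma} &= \Gamma^\alpha_{\beta\delta,\gamma}\, \mathbf{x}_\alpha + \Gamma^\alpha_{\beta\delta}\, \mathbf{x}_{\alpha\gamma} + b_{\beta\delta,\gamma}\, \mathbf{n} + b_{\beta\delta}\, \mathbf{n}_\gamma, \\
&= \Gamma^\alpha_{\beta\delta,\gamma}\, \mathbf{x}_\alpha + \Gamma^\alpha_{\beta\delta} \left[ \Gamma^\mu_{\alpha\gamma}\, \mathbf{x}_\mu + b_{\alpha\gamma}\, \mathbf{n} \right] + b_{\beta\delta,\gamma}\, \mathbf{n} - b_{\beta\delta} b^\alpha_\gamma\, \mathbf{x}_\alpha,
\end{aligned}$$

$$\mathbf{x}_{\beta\delta\gamma} = \left[ \Gamma^\alpha_{\beta\delta,\gamma} + \Gamma^\mu_{\beta\delta} \Gamma^\alpha_{\mu\gamma} - b_{\beta\delta} b^\alpha_\gamma \right] \mathbf{x}_\alpha + \left[ \Gamma^\alpha_{\beta\delta} b_{\alpha\gamma} + b_{\beta\delta,\gamma} \right] \mathbf{n}, \tag{4.92}$$

$$\mathbf{x}_{\beta\gamma\delta} = \left[ \Gamma^\alpha_{\beta\gamma,\delta} + \Gamma^\mu_{\beta\gamma} \Gamma^\alpha_{\mu\delta} - b_{\beta\gamma} b^\alpha_\delta \right] \mathbf{x}_\alpha + \left[ \Gamma^\alpha_{\beta\gamma} b_{\alpha\delta} + b_{\beta\gamma,\delta} \right] \mathbf{n}. \tag{4.93}$$

The last equation above was obtained from the preceding one just by permuting
$\delta$ and $\gamma$. Subtracting the last two equations and setting the tangential
component to zero we get

$$R^\alpha{}_{\beta\gamma\delta} = b_{\beta\delta} b^\alpha_\gamma - b_{\beta\gamma} b^\alpha_\delta, \tag{4.94}$$

where the components of the Riemann tensor $R$ are defined by

$$R^\alpha{}_{\beta\gamma\delta} = \Gamma^\alpha_{\beta\delta,\gamma} - \Gamma^\alpha_{\beta\gamma,\delta} + \Gamma^\mu_{\beta\delta} \Gamma^\alpha_{\mu\gamma} - \Gamma^\mu_{\beta\gamma} \Gamma^\alpha_{\mu\delta}. \tag{4.95}$$

Technically we are not justified at this point in calling $R$ a tensor since we have
not established yet the appropriate multi-linear features that a tensor must
exhibit. We address this point in a later chapter. Lowering the index above we
get

$$R_{\alpha\beta\gamma\delta} = b_{\beta\delta} b_{\alpha\gamma} - b_{\beta\gamma} b_{\alpha\delta}. \tag{4.96}$$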


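4.5.6 Theorema egregium Let $M$ be a smooth surface in $\mathbf{R}^3$. Then,

$$K = \frac{R_{1212}}{\det(g)}. \tag{4.97}$$

Proof Let $\alpha = \gamma = 1$ and $\beta = \delta = 2$ above. The equation then reads

$$\begin{aligned}
R_{1212} &= b_{22} b_{11} - b_{21} b_{12}, \\
&= (eg - f^2), \\
&= K (EG - F^2), \\
&= K \det(g).
\end{aligned}$$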

The remarkable result is that the Riemann tensor and hence the Gaussian 
curvature does not depend on the second fundamental form but only on the 
coefficients of the metric. Thus, the Gaussian curvature is an intrinsic quan- 
tity independent of the embedding, so that two surfaces that have the same 
first fundamental form have the same curvature. In this sense, the Gaussian 
curvature is a bending invariant! 




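Setting the normal components equal to zero gives

$$\Gamma^\alpha_{\beta\delta} b_{\alpha\gamma} - \Gamma^\alpha_{\beta\gamma} b_{\alpha\delta} + b_{\beta\delta,\gamma} - b_{\beta\gamma,\delta} = 0. \tag{4.98}$$

These are called the Codazzi (or Codazzi-Mainardi) equations.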

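Computing the Riemann tensor is labor intensive since one must first obtain
all the non-zero Christoffel symbols as shown in the example above.
Considerable gain in efficiency results from a form computation. For this purpose,
let $\{e_1, e_2, e_3\}$ be a Darboux frame adapted to the surface $M$, with $e_3 = \mathbf{n}$.
Let $\{\theta^1, \theta^2, \theta^3\}$ be the corresponding orthonormal dual basis. Since at every
point, a tangent vector $X \in TM$ is a linear combination of $\{e_1, e_2\}$, we see
that $\theta^3(X) = 0$ for all such vectors. That is, $\theta^3 = 0$ on the surface. As a
consequence, the entire set of the structure equations is

$$\begin{aligned}
d\theta^1 &= -\omega^1{}_2 \wedge \theta^2, && (4.99) \\
d\theta^2 &= -\omega^2{}_1 \wedge \theta^1, && (4.100) \\
d\theta^3 &= -\omega^3{}_1 \wedge \theta^1 - \omega^3{}_2 \wedge \theta^2 = 0, && (4.101) \\
d\omega^1{}_2 &= -\omega^1{}_3 \wedge \omega^3{}_2, \quad \text{Gauss Equation} && (4.102) \\
d\omega^1{}_3 &= -\omega^1{}_2 \wedge \omega^2{}_3, \quad \text{Codazzi Equations} && (4.103) \\
d\omega^2{}_3 &= -\omega^2{}_1 \wedge \omega^1{}_3. && (4.104)
\end{aligned}$$

The key result is the following theorem

4.5.7 Curvature form equations

$$d\omega^1{}_2 = K\, \theta^1 \wedge \theta^2, \tag{4.105}$$

$$\omega^1{}_3 \wedge \theta^2 + \omega^3{}_2 \wedge \theta^1 = -2H\, \theta^1 \wedge \theta^2. \tag{4.106}$$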


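Proof By applying the Weingarten map to the basis vectors $\{e_1, e_2\}$ of $TM$,
we find a matrix representation of the linear transformation:

$$\begin{aligned}
Le_1 &= -\overline{\nabla}_{e_1} e_3 = -\omega^1{}_3(e_1)\, e_1 - \omega^2{}_3(e_1)\, e_2, \\
Le_2 &= -\overline{\nabla}_{e_2} e_3 = -\omega^1{}_3(e_2)\, e_1 - \omega^2{}_3(e_2)\, e_2.
\end{aligned}$$

Recalling that $\omega$ is antisymmetric, we find:

$$\begin{aligned}
K = \det(L) &= -\left[ \omega^1{}_3(e_1)\, \omega^3{}_2(e_2) - \omega^1{}_3(e_2)\, \omega^3{}_2(e_1) \right], \\
&= -(\omega^1{}_3 \wedge \omega^3{}_2)(e_1, e_2), \\
&= d\omega^1{}_2 (e_1, e_2).
\end{aligned}$$

Hence

$$d\omega^1{}_2 = K\, \theta^1 \wedge \theta^2.$$

Similarly, recalling that $\theta^i(e_j) = \delta^i_j$, we have

$$\begin{aligned}
(\omega^1{}_3 \wedge \theta^2 + \omega^3{}_2 \wedge \theta^1)(e_1, e_2) &= \omega^1{}_3(e_1) - \omega^3{}_2(e_2), \\
&= \omega^1{}_3(e_1) + \omega^2{}_3(e_2), \\
&= -\mathrm{Tr}(L) = -2H.
\end{aligned}$$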


4.5.8 Definition A point of a surface at which K = 0 is called a planar 
point. A surface with K = 0 at all points is called a flat or Gaussian flat 
surface. A surface on which H = 0 at all points is called a minimal surface. 


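4.5.9 Example Sphere Since the first fundamental form is $I = a^2\, d\theta^2 +
a^2 \sin^2\theta\, d\phi^2$, we have

$$\begin{aligned}
\theta^1 &= a\, d\theta, \\
\theta^2 &= a \sin\theta\, d\phi, \\
d\theta^2 &= a \cos\theta\, d\theta \wedge d\phi, \\
&= -\cos\theta\, d\phi \wedge \theta^1 = -\omega^2{}_1 \wedge \theta^1, \\
\omega^2{}_1 &= \cos\theta\, d\phi = -\omega^1{}_2, \\
d\omega^1{}_2 &= \sin\theta\, d\theta \wedge d\phi = \frac{1}{a^2} (a\, d\theta) \wedge (a \sin\theta\, d\phi), \\
&= \frac{1}{a^2}\, \theta^1 \wedge \theta^2, \\
K &= \frac{1}{a^2}.
\end{aligned}$$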


4.5.10 Example Torus 


Using the parametrization (see 4.24),

$$\mathbf{x} = ((b + a \cos\theta) \cos\phi, \; (b + a \cos\theta) \sin\phi, \; a \sin\theta),$$

the first fundamental form is

$$ds^2 = a^2\, d\theta^2 + (b + a \cos\theta)^2\, d\phi^2.$$

Thus, we have:

$$\begin{aligned}
\theta^1 &= a\, d\theta, \\
\theta^2 &= (b + a \cos\theta)\, d\phi, \\
d\theta^2 &= -a \sin\theta\, d\theta \wedge d\phi, \\
&= \sin\theta\, d\phi \wedge \theta^1 = -\omega^2{}_1 \wedge \theta^1, \\
\omega^2{}_1 &= -\sin\theta\, d\phi = -\omega^1{}_2, \\
d\omega^1{}_2 &= \cos\theta\, d\theta \wedge d\phi = \frac{\cos\theta}{a (b + a \cos\theta)} (a\, d\theta) \wedge [(b + a \cos\theta)\, d\phi], \\
&= \frac{\cos\theta}{a (b + a \cos\theta)}\, \theta^1 \wedge \theta^2, \\
K &= \frac{\cos\theta}{a (b + a \cos\theta)}.
\end{aligned}$$

This result makes intuitive sense.

When $\theta = 0$, the points lie on the outer equator, so $K = \frac{1}{a(b + a)} > 0$ is the
product of the curvatures of the generating circle and the outer equator circle.
The points are elliptic.

When $\theta = \pi/2$, the points lie on the top of the torus, so $K = 0$. The points
are parabolic.

When $\theta = \pi$, the points lie on the inner equator, so $K = \frac{-1}{a(b - a)} < 0$ is the
product of the curvatures of the generating circle and the inner equator circle.
The points are hyperbolic.


4.5.11 Example Orthogonal parametric curves 


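The examples above have the common feature that the parametric curves are
orthogonal and hence $F = 0$. Using the same method, we can find a general
formula for such cases. Since the first fundamental form is given by

$$I = E\, du^2 + G\, dv^2,$$

we have:

$$\begin{aligned}
\theta^1 &= \sqrt{E}\, du, \\
\theta^2 &= \sqrt{G}\, dv, \\
d\theta^1 &= (\sqrt{E})_v\, dv \wedge du = -(\sqrt{E})_v\, du \wedge dv, \\
&= -\frac{(\sqrt{E})_v}{\sqrt{G}}\, du \wedge \theta^2 = -\omega^1{}_2 \wedge \theta^2, \\
d\theta^2 &= (\sqrt{G})_u\, du \wedge dv = -(\sqrt{G})_u\, dv \wedge du, \\
&= -\frac{(\sqrt{G})_u}{\sqrt{E}}\, dv \wedge \theta^1 = -\omega^2{}_1 \wedge \theta^1, \\
\omega^1{}_2 &= \frac{(\sqrt{E})_v}{\sqrt{G}}\, du - \frac{(\sqrt{G})_u}{\sqrt{E}}\, dv, \\
d\omega^1{}_2 &= -\frac{\partial}{\partial u} \left( \frac{1}{\sqrt{E}} \frac{\partial \sqrt{G}}{\partial u} \right) du \wedge dv + \frac{\partial}{\partial v} \left( \frac{1}{\sqrt{G}} \frac{\partial \sqrt{E}}{\partial v} \right) dv \wedge du, \\
&= -\frac{1}{\sqrt{EG}} \left[ \frac{\partial}{\partial u} \left( \frac{1}{\sqrt{E}} \frac{\partial \sqrt{G}}{\partial u} \right) + \frac{\partial}{\partial v} \left( \frac{1}{\sqrt{G}} \frac{\partial \sqrt{E}}{\partial v} \right) \right] \theta^1 \wedge \theta^2.
\end{aligned}$$

Therefore, the Gaussian curvature of a surface mapped by a coordinate patch
in which the parametric lines are orthogonal is given by:

$$K = -\frac{1}{\sqrt{EG}} \left[ \frac{\partial}{\partial u} \left( \frac{1}{\sqrt{E}} \frac{\partial \sqrt{G}}{\partial u} \right) + \frac{\partial}{\partial v} \left( \frac{1}{\sqrt{G}} \frac{\partial \sqrt{E}}{\partial v} \right) \right]. \tag{4.107}$$

Again, to connect with more classical notation, if a surface described by a
coordinate patch $\mathbf{x}(u, v)$ has first fundamental form given by $I = E\, du^2 + G\, dv^2$,
then

$$\begin{aligned}
d\mathbf{x} &= \mathbf{x}_u\, du + \mathbf{x}_v\, dv, \\
&= \frac{\mathbf{x}_u}{\sqrt{E}} \sqrt{E}\, du + \frac{\mathbf{x}_v}{\sqrt{G}} \sqrt{G}\, dv, \\
&= \frac{\mathbf{x}_u}{\sqrt{E}}\, \theta^1 + \frac{\mathbf{x}_v}{\sqrt{G}}\, \theta^2,
\end{aligned}$$

$$d\mathbf{x} = e_1\, \theta^1 + e_2\, \theta^2, \tag{4.108}$$

where

$$e_1 = \frac{\mathbf{x}_u}{\sqrt{E}}, \qquad e_2 = \frac{\mathbf{x}_v}{\sqrt{G}}.$$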

Thus, when the parametric curves are orthogonal, the triplet $\{e_1, e_2, e_3 = \mathbf{n}\}$
constitutes a moving orthonormal frame adapted to the surface. The
awkwardness of combining calculus vectors and differential forms in the same equation
is mitigated by the ease of jumping back and forth between the classical and
the modern formalism. Thus, for example, the covariant differential of the normal
in 4.104 can be rewritten without the arbitrary vector in the operator $LX$ as
shown:

$$\overline{\nabla}_X e_3 = \omega^1{}_3(X)\, e_1 + \omega^2{}_3(X)\, e_2, \tag{4.109}$$

$$de_3 = e_1\, \omega^1{}_3 + e_2\, \omega^2{}_3. \tag{4.110}$$

The equation just expresses the fact that the components of the Weingarten
map, that is, the second fundamental form in this basis, can be written as some
symmetric matrix given by:

$$\begin{aligned}
\omega^1{}_3 &= l\, \theta^1 + m\, \theta^2, \\
\omega^2{}_3 &= m\, \theta^1 + n\, \theta^2.
\end{aligned} \tag{4.111}$$

If $E = 1$, we say that the metric

$$ds^2 = du^2 + G(u,v)\, dv^2, \tag{4.112}$$

is in geodesic coordinates. In this case, the equation for curvature reduces even
further to:

$$K = -\frac{1}{\sqrt{G}} \frac{\partial^2 \sqrt{G}}{\partial u^2}. \tag{4.113}$$

The case is not as special as it appears at first. The change of parameters

$$\hat{u} = \int_0^u \sqrt{E}\, du$$

results in $d\hat{u}^2 = E\, du^2$, and thus it transforms an orthogonal system to one with
$E = 1$. The parameters are reminiscent of polar coordinates $ds^2 = dr^2 + r^2\, d\phi^2$.
Equation 4.113 is called Jacobi’s differential equation for geodesic coordinates.
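Formula 4.107 uses only the metric coefficients, so it lends itself to a direct numerical check. The sketch below is a hedged illustration: it evaluates the formula with nested central differences for metric coefficients supplied as Python functions of $(u, v)$. The function name, the step size and the two test metrics (sphere of radius 2, torus with $a = 1$, $b = 3$) are assumptions chosen to match the examples worked out with differential forms.

```python
import math

def gaussian_curvature_orthogonal(E, G, u, v, h=1e-3):
    """Approximate K from formula 4.107 for a metric E du^2 + G dv^2."""
    def sqrt_E(uu, vv):
        return math.sqrt(E(uu, vv))

    def sqrt_G(uu, vv):
        return math.sqrt(G(uu, vv))

    def inner_u(uu, vv):   # (1/sqrt(E)) d sqrt(G)/du
        return (sqrt_G(uu + h, vv) - sqrt_G(uu - h, vv)) / (2 * h) / sqrt_E(uu, vv)

    def inner_v(uu, vv):   # (1/sqrt(G)) d sqrt(E)/dv
        return (sqrt_E(uu, vv + h) - sqrt_E(uu, vv - h)) / (2 * h) / sqrt_G(uu, vv)

    d_inner_u = (inner_u(u + h, v) - inner_u(u - h, v)) / (2 * h)
    d_inner_v = (inner_v(u, v + h) - inner_v(u, v - h)) / (2 * h)
    return -(d_inner_u + d_inner_v) / (sqrt_E(u, v) * sqrt_G(u, v))

# Sphere of radius 2 with u = theta, v = phi.
K_sphere = gaussian_curvature_orthogonal(
    lambda u, v: 4.0, lambda u, v: 4.0 * math.sin(u) ** 2, 1.0, 0.3)

# Torus with a = 1, b = 3, u = theta, v = phi.
K_torus = gaussian_curvature_orthogonal(
    lambda u, v: 1.0, lambda u, v: (3.0 + math.cos(u)) ** 2, 0.7, 0.3)

print(K_sphere, K_torus)
```

Because the second fundamental form never enters, the same routine applies to any surface given only by an orthogonal first fundamental form.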


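A slick proof of the theorema egregium can be obtained by differential forms.
Let $F : M \to \tilde{M}$ be an isometry between two surfaces with metrics $g$ and $\tilde{g}$
respectively. Let $\{e_\alpha\}$ be an orthonormal basis for $M$ with dual basis $\{\theta^\alpha\}$. Define
$\tilde{e}_\alpha = F_* e_\alpha$. Recalling that isometries preserve inner products, we have

$$\langle \tilde{e}_\alpha, \tilde{e}_\beta \rangle = \langle F_* e_\alpha, F_* e_\beta \rangle = \langle e_\alpha, e_\beta \rangle = \delta_{\alpha\beta}.$$

Thus, $\{\tilde{e}_\alpha\}$ is also an orthonormal basis of the tangent space of $\tilde{M}$. Let $\tilde{\theta}^\alpha$
be the dual forms and denote with tildes the connection forms and Gaussian
curvature of $\tilde{M}$.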


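4.5.12 Theorem (Theorema egregium)

a) $F^* \tilde{\theta}^\alpha = \theta^\alpha$,
b) $F^* \tilde{\omega}^\alpha{}_\beta = \omega^\alpha{}_\beta$,
c) $F^* \tilde{K} = K$.

Proof

a) It suffices to show that the forms agree on basis vectors. We have

$$F^* \tilde{\theta}^\alpha (e_\beta) = \tilde{\theta}^\alpha (F_* e_\beta) = \tilde{\theta}^\alpha (\tilde{e}_\beta) = \delta^\alpha_\beta = \theta^\alpha (e_\beta).$$

b) We compute the pull-back of the first structure equation in $\tilde{M}$:

$$\begin{aligned}
d\tilde{\theta}^\alpha + \tilde{\omega}^\alpha{}_\beta \wedge \tilde{\theta}^\beta &= 0, \\
F^* d\tilde{\theta}^\alpha + F^* \tilde{\omega}^\alpha{}_\beta \wedge F^* \tilde{\theta}^\beta &= 0, \\
d\theta^\alpha + F^* \tilde{\omega}^\alpha{}_\beta \wedge \theta^\beta &= 0.
\end{aligned}$$

The connection forms are defined uniquely by the first structure equation, so
$F^* \tilde{\omega}^\alpha{}_\beta = \omega^\alpha{}_\beta$.

c) In a similar manner, we compute the pull-back of the curvature equation:

$$\begin{aligned}
d\tilde{\omega}^1{}_2 &= \tilde{K}\, \tilde{\theta}^1 \wedge \tilde{\theta}^2, \\
F^* d\tilde{\omega}^1{}_2 &= (F^* \tilde{K})\, F^* \tilde{\theta}^1 \wedge F^* \tilde{\theta}^2, \\
d F^* \tilde{\omega}^1{}_2 &= (F^* \tilde{K})\, F^* \tilde{\theta}^1 \wedge F^* \tilde{\theta}^2, \\
d\omega^1{}_2 &= (F^* \tilde{K})\, \theta^1 \wedge \theta^2.
\end{aligned}$$

So again by uniqueness, $F^* \tilde{K} = K$.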


4.5.13 Example Catenoid - Helicoid 

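Perhaps the most celebrated manifestation of the theorema egregium is that
of the mapping between a helicoid $M$ and a catenoid $\tilde{M}$. Let $a = c$, and label the
coordinate patch for the former as $\mathbf{x}(u^\alpha)$ and $\mathbf{y}(\tilde{u}^\alpha)$ for the latter. The first
fundamental forms are given as in 4.25 and 4.26.

$$\begin{aligned}
ds^2 &= du^2 + (u^2 + a^2)\, dv^2, & E &= 1, \quad G = u^2 + a^2, \\
d\tilde{s}^2 &= \frac{\tilde{u}^2}{\tilde{u}^2 - a^2}\, d\tilde{u}^2 + \tilde{u}^2\, d\tilde{v}^2, & \tilde{E} &= \frac{\tilde{u}^2}{\tilde{u}^2 - a^2}, \quad \tilde{G} = \tilde{u}^2.
\end{aligned}$$

Let $F : M \to \tilde{M}$ be the mapping $\mathbf{y} = F\mathbf{x}$, defined by $\tilde{u}^2 = u^2 + a^2$ and $\tilde{v} = v$.
Since $\tilde{u}\, d\tilde{u} = u\, du$, we have $\tilde{u}^2\, d\tilde{u}^2 = u^2\, du^2$ which shows that the mapping
preserves the metric and hence it is an isometry. The Gaussian curvatures $K$
and $\tilde{K}$ follow from an easy computation using formula 4.107.

$$K = -\frac{1}{\sqrt{u^2 + a^2}} \frac{\partial}{\partial u} \left( \frac{\partial \sqrt{u^2 + a^2}}{\partial u} \right) = -\frac{a^2}{(u^2 + a^2)^2}, \tag{4.114}$$

$$\tilde{K} = -\frac{\sqrt{\tilde{u}^2 - a^2}}{\tilde{u}^2} \frac{\partial}{\partial \tilde{u}} \left( \frac{\sqrt{\tilde{u}^2 - a^2}}{\tilde{u}} \right) = -\frac{a^2}{\tilde{u}^4}. \tag{4.115}$$

It is immediately evident by substitution that as expected $F^* \tilde{K} = K$. Figure
4.13 shows several stages of a one-parameter family $M_t$ of isometries deforming
a catenoid into a helicoid. The one-parameter family of coordinate patches
chosen is

$$\mathbf{z}_t = (\cos t)\, \mathbf{x} + (\sin t)\, \mathbf{y}. \tag{4.116}$$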



Fig. 4.13: Catenoid - Helicoid isometry 


Writing the equation of the coordinate patch $\mathbf{z}_t$ in complete detail, one can compute the coefficients of the fundamental forms and thus establish that the family of surfaces has mean curvature $H$ independent of the parameter $t$, and in fact $H = 0$ for each member of the family. We will discuss the geometry of surfaces of zero mean curvature in a later chapter.
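The isometry claim in this example can be checked numerically. The following is a minimal sketch, not part of the original computation: it pulls the catenoid metric back through $\tilde{u}^2 = u^2 + a^2$, $\tilde{v} = v$ and compares the curvatures 4.114 and 4.115 at one sample point. The function names and the sample values of $a$ and $u$ are our own choices, and $u \neq 0$ is assumed so that the division is well defined.

```python
import math

def pulled_back_metric(u, a):
    """Coefficients of du^2 and dv^2 after pulling back the catenoid metric."""
    ut = math.sqrt(u * u + a * a)          # u~ as a function of u
    dut_du = u / ut                        # from u~ du~ = u du
    E_tilde = ut * ut / (ut * ut - a * a)  # catenoid E~
    G_tilde = ut * ut                      # catenoid G~
    return E_tilde * dut_du ** 2, G_tilde

def K_helicoid(u, a):
    return -a * a / (u * u + a * a) ** 2   # equation 4.114

def K_catenoid(ut, a):
    return -a * a / ut ** 4                # equation 4.115

a, u = 1.5, 0.7                            # arbitrary sample values
E_pull, G_pull = pulled_back_metric(u, a)
K1 = K_helicoid(u, a)
K2 = K_catenoid(math.sqrt(u * u + a * a), a)
print(E_pull, G_pull, u * u + a * a)
print(K1, K2)
```

The pulled-back coefficients should reproduce $E = 1$ and $G = u^2 + a^2$ of the helicoid, and the two curvature values should agree up to rounding.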


4.5.14 Example Isothermal coordinates. 
Consider the case in which the metric has the form 


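$$ds^2 = \lambda^2 (du^2 + dv^2), \tag{4.117}$$

so that $E = G = \lambda^2$, $F = 0$. A metric in this form is said to be in isothermal coordinates. Inserting into equation 4.107, we get

$$K = -\frac{1}{\lambda^2}\left[\frac{\partial}{\partial u}\left(\frac{1}{\lambda}\frac{\partial \lambda}{\partial u}\right) + \frac{\partial}{\partial v}\left(\frac{1}{\lambda}\frac{\partial \lambda}{\partial v}\right)\right] = -\frac{1}{\lambda^2}\left[\frac{\partial^2}{\partial u^2}\ln\lambda + \frac{\partial^2}{\partial v^2}\ln\lambda\right].$$

Hence,

$$K = -\frac{1}{\lambda^2}\nabla^2(\ln \lambda). \tag{4.118}$$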
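As a quick sanity check of formula 4.118, the sketch below evaluates $-\lambda^{-2}\nabla^2 \ln\lambda$ with a five-point finite-difference Laplacian. The two conformal factors are standard examples chosen by us for illustration (they are not taken from this section): the unit sphere in stereographic coordinates, which should give $K = 1$, and the upper half-plane metric, which should give $K = -1$. The step size `h` is an assumption.

```python
import math

def gauss_curvature_isothermal(lam, u, v, h=1e-3):
    """K = -(1/lam^2) * Laplacian(ln lam), Laplacian by central differences."""
    f = lambda a, b: math.log(lam(a, b))
    lap = (f(u + h, v) + f(u - h, v) + f(u, v + h) + f(u, v - h)
           - 4.0 * f(u, v)) / (h * h)
    return -lap / lam(u, v) ** 2

# Unit sphere, stereographic coordinates: ds^2 = 4 (du^2 + dv^2) / (1 + u^2 + v^2)^2
sphere = lambda u, v: 2.0 / (1.0 + u * u + v * v)
# Upper half-plane: ds^2 = (du^2 + dv^2) / v^2, valid for v > 0
half_plane = lambda u, v: 1.0 / v

K_sphere = gauss_curvature_isothermal(sphere, 0.3, -0.4)
K_half_plane = gauss_curvature_isothermal(half_plane, 0.2, 1.7)
print(K_sphere, K_half_plane)
```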


The tantalizing appearance of the Laplacian in this coordinate system gives 
an inkling that there is some complex analysis lurking in the neighborhood. 
Readers acquainted with complex variables will recall that the real and imag- 
inary parts of holomorphic functions satisfy Laplace’s equations and that any 
holomorphic function in the complex plane describes a conformal map. In an- 
ticipation of further discussion on this matter, we prove the following: 


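4.5.15 Theorem Define the mean curvature vector $\mathbf{H} = H\mathbf{n}$. If $\mathbf{x}(u, v)$ is an isothermal parametrization of a surface, then

$$\mathbf{x}_{uu} + \mathbf{x}_{vv} = 2\lambda^2 \mathbf{H}. \tag{4.119}$$

Proof Since the coordinate patch is isothermal, $E = G = \lambda^2$ and $F = 0$. Specifically, we have $\langle \mathbf{x}_u, \mathbf{x}_u \rangle = \langle \mathbf{x}_v, \mathbf{x}_v \rangle$, and $\langle \mathbf{x}_u, \mathbf{x}_v \rangle = 0$. Differentiation then gives:

$$\begin{aligned}
\langle \mathbf{x}_{uu}, \mathbf{x}_u \rangle &= \langle \mathbf{x}_v, \mathbf{x}_{vu} \rangle,\\
&= -\langle \mathbf{x}_{vv}, \mathbf{x}_u \rangle,\\
\langle \mathbf{x}_{uu} + \mathbf{x}_{vv}, \mathbf{x}_u \rangle &= 0.
\end{aligned}$$

In the same manner,

$$\begin{aligned}
\langle \mathbf{x}_{vv}, \mathbf{x}_v \rangle &= \langle \mathbf{x}_u, \mathbf{x}_{uv} \rangle,\\
&= -\langle \mathbf{x}_{uu}, \mathbf{x}_v \rangle,\\
\langle \mathbf{x}_{uu} + \mathbf{x}_{vv}, \mathbf{x}_v \rangle &= 0.
\end{aligned}$$

It follows that $\mathbf{x}_{uu} + \mathbf{x}_{vv}$ is orthogonal to the surface and points in the direction of the normal $\mathbf{n}$. On the other hand,

$$\begin{aligned}
\frac{Eg + Ge}{2EG} &= H,\\
\frac{g + e}{2\lambda^2} &= H,\\
e + g &= 2\lambda^2 H,\\
\langle \mathbf{x}_{uu} + \mathbf{x}_{vv}, \mathbf{n} \rangle &= 2\lambda^2 H,\\
\mathbf{x}_{uu} + \mathbf{x}_{vv} &= 2\lambda^2 \mathbf{H}.
\end{aligned}$$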
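To see equation 4.119 at work, here is a short worked example. The patch below is the usual isothermal parametrization of the catenoid with unit neck radius; it is our choice of illustration and is not the patch 4.26 used earlier.

```latex
\begin{align*}
\mathbf{x}(u,v) &= (\cosh u \cos v,\ \cosh u \sin v,\ u), \\
\mathbf{x}_u &= (\sinh u \cos v,\ \sinh u \sin v,\ 1), \qquad
\mathbf{x}_v = (-\cosh u \sin v,\ \cosh u \cos v,\ 0), \\
E &= \sinh^2 u + 1 = \cosh^2 u = G, \qquad F = 0, \qquad \lambda = \cosh u, \\
\mathbf{x}_{uu} &= (\cosh u \cos v,\ \cosh u \sin v,\ 0), \qquad
\mathbf{x}_{vv} = (-\cosh u \cos v,\ -\cosh u \sin v,\ 0), \\
\mathbf{x}_{uu} + \mathbf{x}_{vv} &= 0 \quad\Longrightarrow\quad \mathbf{H} = 0 .
\end{align*}
```

So the catenoid has zero mean curvature, and formula 4.118 with $\lambda = \cosh u$ gives $K = -\operatorname{sech}^2 u\, \partial_u^2 \ln\cosh u = -\operatorname{sech}^4 u$.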


Chapter 5 


Geometry of Surfaces 


5.1 Surfaces of Constant Curvature 


5.1.1 Ruled and Developable Surfaces 


We present a brief discussion of surfaces of constant curvature $K = 0$. Since $K = \kappa_1\kappa_2$, a surface with zero Gaussian curvature at each point must have a principal direction with zero normal curvature, that is, either $\kappa_1 = 0$ or $\kappa_2 = 0$. It is therefore a necessary condition for a surface to have $K = 0$ that at each point there be a principal direction which is a straight line. A surface having this property of containing a straight line or segment of a straight line at each point is called a ruled surface. We may think of a ruled surface as a surface generated by the motion of a straight line. Given a point $p$ on a ruled surface, let $\alpha(t)$ be a curve with $\alpha(0) = p$, and let $X(t)$ be a unit vector field on the curve pointing along the lines at their intersection points with the curve. One can then parametrize the surface near $p$ by a coordinate patch

$$\mathbf{y}(t, v) = \alpha(t) + v X(t),$$

as shown in the figure.


Having a straight line passing through each point in the surface is a necessary 
but not sufficient condition to ensure that K = 0, as illustrated by the following 
examples: 

1) Saddle. Consider the saddle $z = xy$ which is trivially parametrized by the coordinate patch $\mathbf{y}(u, v) = (u, v, uv)$. The patch can be written as $\mathbf{y}(u, v) = (u, 0, 0) + v(0, 1, u)$ or as $\mathbf{y}(u, v) = (0, v, 0) + u(1, 0, v)$, so that the surface is doubly-ruled as shown in figure 5.1(a). The rulings are the coordinate curves $u =$ constant and $v =$ constant. This neat fact is reflected in some architectural designs of simple structures with roofs made of straight slabs arranged in the shape of a hyperbolic paraboloid. A short computation gives

$$K = -(1 + u^2 + v^2)^{-2} = -(1 + x^2 + y^2)^{-2}.$$


2) Hyperboloid. A common calculus example of a doubly ruled surface is given by circular hyperboloids of one sheet. Consider the circle $\alpha(u) = (\cos u, \sin u, 0)$, and the vector field $X(u) = \alpha' + \mathbf{k}$ which points at a constant skew angle $\pi/4$. Then the $(x, y, z)$ coordinates in the parametrization

$$\begin{aligned}
\mathbf{y}(u, v) &= \alpha(u) + v X(u),\\
&= (\cos u, \sin u, 0) + v(-\sin u, \cos u, 1),\\
&= (\cos u - v \sin u,\ \sin u + v \cos u,\ v),
\end{aligned}$$

satisfy the equation $x^2 + y^2 - z^2 = 1$; that is, the surface is a circular hyperboloid of one sheet. The coordinate curves $u =$ constant are straight line generators. If instead we choose $X(u) = -\alpha' + \mathbf{k}$, we get the same surface, but with a second set of line generators as shown in figure 5.1(b). Along the throat circle the two sets of generators are orthogonal, since $\langle \alpha' + \mathbf{k}, -\alpha' + \mathbf{k} \rangle = 0$, so this is an example of a surface in which the asymptotic curves meet at right angles there. Tangent planes to the surface at any point in the circle $x^2 + y^2 = 1$ at the throat intersect the hyperboloid in a pair of line generators. The Gaussian curvature is also negative and is given by

$$K = -(1 + 2v^2)^{-2} = -(1 + 2z^2)^{-2}.$$

The double-ruled nature of the circular hyperboloid has been exploited by civil engineers in the design of heavy-duty gears with long teeth engaging along the lines. The double-ruling is also advantageous for the construction of the metal frame of a type of tower used in nuclear reactors.
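The claim that both choices of $X(u)$ sweep out the same hyperboloid is easy to test numerically. The following is a small sketch under our own naming; the `sign` argument selects $X = \alpha' + \mathbf{k}$ or $X = -\alpha' + \mathbf{k}$, and the sample parameter values are arbitrary.

```python
import math

def hyperboloid_point(u, v, sign=1.0):
    """Point alpha(u) + v X(u) with X = sign * alpha'(u) + k."""
    x = math.cos(u) - sign * v * math.sin(u)
    y = math.sin(u) + sign * v * math.cos(u)
    return x, y, v

def quadric(p):
    """Left-hand side of x^2 + y^2 - z^2 = 1."""
    x, y, z = p
    return x * x + y * y - z * z

residuals = [abs(quadric(hyperboloid_point(u, v, s)) - 1.0)
             for s in (1.0, -1.0)
             for u in (0.0, 0.9, 2.5, 4.0)
             for v in (-2.0, -0.5, 0.0, 1.3)]
worst = max(residuals)
print(worst)  # should be at rounding level for both families of rulings
```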

3) Möbius Band. The formal definition of an orientable surface $M$ is that there exists a two-form that is non-zero at every point of $M$. The idea is that the 2-form represents the oriented differential of surface area $dS = \sqrt{\det g}\; du \wedge dv$. For the present purpose, an intuitive characterization is that there exists a unit normal vector field on $M$. The Möbius band is the most famous example of a non-orientable surface. It can be parametrized by the coordinate patch

$$\begin{aligned}
\mathbf{y}(u, v) &= \alpha(u) + v X(u),\\
\alpha(u) &= (\cos 2u, \sin 2u, 0),\\
X(u) &= (\cos u \cos 2u,\ \cos u \sin 2u,\ \sin u),\\
\mathbf{y}(u, v) &= (\cos 2u + v \cos u \cos 2u,\ \sin 2u + v \cos u \sin 2u,\ v \sin u).
\end{aligned} \tag{5.1}$$

The curve $\alpha(u)$ is a circle, and the vector $X(u)$ on the circle points in a direction that winds around by an angle $\pi$ in one revolution. In the rendition of the surface shown in figure 5.1, the parameter $v$ is restricted to $[-0.2, 0.2]$. As is evident from the graph, the generating line segment flips after one turn, resulting in a one-sided surface. Indeed, at $u = 0$ the generating line segment is $vX(0) = v(1, 0, 0)$, but after one revolution at $u = \pi$, the generating line segment given by $vX(\pi) = v(-1, 0, 0)$ points in the opposite direction. We can interpret the topology of the surface as a rectangle with a pair of opposite sides identified (see figure 9.2). Clearly, it is impossible to choose a well-defined normal vector field. The Gaussian curvature is somewhat messy, but the computation shows that $K$ is negative everywhere.
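The flip of the generating segment can be verified directly from the patch 5.1. The sketch below (function names and sample values are ours) checks that advancing $u$ by $\pi$ returns to the same base point on the circle while reversing the ruling, so that the point with parameters $(u + \pi, v)$ coincides with the point $(u, -v)$.

```python
import math

def mobius(u, v):
    """Coordinate patch 5.1."""
    c, s = math.cos(u), math.sin(u)
    c2, s2 = math.cos(2 * u), math.sin(2 * u)
    return (c2 + v * c * c2, s2 + v * c * s2, v * s)

def ruling_direction(u):
    """The vector X(u) along the generating segment."""
    c, s = math.cos(u), math.sin(u)
    return (c * math.cos(2 * u), c * math.sin(2 * u), s)

u, v = 0.8, 0.15                           # arbitrary sample point
p = mobius(u + math.pi, v)
q = mobius(u, -v)
gap = max(abs(a - b) for a, b in zip(p, q))
flip = max(abs(a + b) for a, b in
           zip(ruling_direction(u + math.pi), ruling_direction(u)))
print(gap, flip)  # both should vanish up to rounding
```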


a) Saddle b) Hyperboloid 


Fig. 5.1: Examples of Ruled Surfaces 


Let $M$ be a ruled surface with unit normal $N$ and let $X$ be a unit vector field tangent along the straight lines that generate the surface. The straight lines are geodesics in $\mathbb{R}^3$, so $\overline{\nabla}_X X = 0$. By Gauss's equation 4.74, $\nabla_X X = 0$, so the line is also a geodesic on the surface and $\langle LX, X \rangle = 0$, that is, the generator lines are asymptotic. Let $Y$ be a unit tangent vector field orthogonal to $X$ so that the pair constitutes an orthonormal basis of the tangent space at each point. Then

$$\begin{aligned}
K &= \langle LX, X \rangle \langle LY, Y \rangle - \langle LX, Y \rangle^2,\\
&= -\langle LX, Y \rangle^2 = -f^2,
\end{aligned}$$

so we conclude that $K \le 0$. If the vectors $X$ and $Y$ are not an orthonormal basis, the result must be modified as in equation 4.66, which gives

$$K = -\frac{f^2}{EG - F^2}. \tag{5.2}$$


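The general formula for the Gaussian curvature of a ruled surface is obtained by a straightforward computation. We have:

$$\begin{aligned}
\mathbf{y}_u &= \alpha' + v X',\\
\mathbf{y}_v &= X,\\
\mathbf{y}_u \times \mathbf{y}_v &= (\alpha' + v X') \times X,\\
\sqrt{EG - F^2} &= \|\mathbf{y}_u \times \mathbf{y}_v\| = \|X \times (\alpha' + v X')\|,\\
\mathbf{y}_{uv} &= X', \qquad \mathbf{y}_{vv} = 0.
\end{aligned}$$

Hence, $g = \langle \mathbf{y}_{vv}, N \rangle = 0$, and $f = \langle \mathbf{y}_{uv}, N \rangle = (\alpha'\, X\, X')/\sqrt{EG - F^2}$, where we are using the notation for the triple product. The resulting curvature is:

$$K = -\frac{(\alpha'\, X\, X')^2}{\|X \times (\alpha' + v X')\|^4}. \tag{5.3}$$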
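Formula 5.3 can be tested against the two closed-form curvatures quoted above. The following sketch uses small hand-written vector helpers (our own, to stay within the standard library) and evaluates 5.3 for the saddle, with $\alpha' = (1,0,0)$, $X = (0,1,u)$, and for the hyperboloid, with $\alpha' = (-\sin u, \cos u, 0)$, $X = \alpha' + \mathbf{k}$. Note that $X$ is not normalized in either case; the formula as derived does not require it.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def ruled_curvature(alpha_p, X, X_p, v):
    """Equation 5.3: K = -(alpha' X X')^2 / |X x (alpha' + v X')|^4."""
    triple = dot(alpha_p, cross(X, X_p))
    w = cross(X, tuple(a + v * x for a, x in zip(alpha_p, X_p)))
    return -triple ** 2 / dot(w, w) ** 2

u, v = 0.6, -1.2                           # arbitrary sample point

# Saddle y = (u, 0, 0) + v (0, 1, u)
K_saddle = ruled_curvature((1.0, 0.0, 0.0), (0.0, 1.0, u), (0.0, 0.0, 1.0), v)

# Hyperboloid y = (cos u, sin u, 0) + v (-sin u, cos u, 1)
s, c = math.sin(u), math.cos(u)
K_hyperboloid = ruled_curvature((-s, c, 0.0), (-s, c, 1.0), (-c, -s, 0.0), v)

print(K_saddle, -(1 + u * u + v * v) ** -2)
print(K_hyperboloid, -(1 + 2 * v * v) ** -2)
```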




We may choose the orthogonal trajectories $\alpha(u)$ to be integral curves of $Y$ parametrized by arc length, so that $Y = \alpha'(u) = T$. Since by choice $T$ and $X$ are orthogonal unit vectors tangent to the surface, $N = T \times X$ is normal to the surface and we have an orthonormal frame at each point. The covariant derivative of the frame with respect to $T$ along the curve $\alpha$ is just the derivative with respect to the parameter, so we get a Frenet-like frame

$$\begin{aligned}
T' &= c_1 X + c_2 N,\\
X' &= -c_1 T + c_3 N,\\
N' &= -c_2 T - c_3 X.
\end{aligned} \tag{5.4}$$

A one-line computation gives:

$$c_3 = -\langle X, N' \rangle = \langle N, X' \rangle = (T\, X\, X').$$

The function

$$p(u) = \frac{(T\, X\, X')}{\|X'\|^2} = \frac{c_3}{c_1^2 + c_3^2}$$

is called the distribution parameter. Substituting 5.4 into 5.3, we rewrite the Gaussian curvature as

$$K = -\frac{(T\, X\, X')^2}{\|X \times (T + v X')\|^4} = -\frac{c_3^2}{[1 - 2c_1 v + c_1^2 v^2 + v^2 c_3^2]^2}.$$

The special curve along which $c_1 = 0$, that is, $\langle T', X \rangle = 0$, is called the stricture curve. Using the parametrization with the base curve being the stricture curve, we have $p(u) = 1/c_3$ and

$$K = -\frac{c_3^2}{(1 + v^2 c_3^2)^2} = -\frac{p^2(u)}{(p^2(u) + v^2)^2}. \tag{5.5}$$


A beautiful example is the hyperboloid of revolution in figure 5.1(b). The circle $\alpha(t)$ at the throat used to generate the surface is the stricture curve. It turns out that $X$ does not need to be orthogonal to $T$, as is the case here, as long as $\|X\| = 1$ and $\langle T', X \rangle = 0$.

A ruled surface is called a developable surface if in addition $LX = \overline{\nabla}_X N = 0$, that is, the normal vector is parallel along the generating lines. Then equality holds in $K \le 0$ and we have the following theorem.

5.1.1 Theorem A necessary and sufficient condition for a surface to be developable is to have Gaussian curvature $K = 0$.

It is surprising that the general case of a closed and connected surface with 
K = 0 to be developable was not proved until 1961 in a short paper by Massey. 
A particularly interesting type of developable surfaces are those in which the 
vector X is taken to be the tangent vector T of the curve a itself. A surface 
with this property is called a tangential developable. 




5.1.2 Example Developable helicoid. A helicoid

$$\mathbf{x}(u, v) = (u \cos v,\ u \sin v,\ v)$$

can be written in the form $\mathbf{x} = \alpha(v) + u X(v)$, where $\alpha(v) = (0, 0, v)$ and $X(v) = (\cos v, \sin v, 0)$, so it is a ruled surface. The surface has negative curvature as computed in 4.114 and the stricture curve is the $z$-axis. A neat related surface is obtained by the tangential developable of a helix. We choose $\alpha(u) = (\cos u, \sin u, u)$ and $X = T = (-\sin u, \cos u, 1)$ so that

$$\mathbf{x}(u, v) = (\cos u - v \sin u,\ \sin u + v \cos u,\ u + v). \tag{5.6}$$

Since this is a flat surface having $K = 0$, it is locally isometric to a plane. Indeed, if one takes a thin cardboard annulus with a slit in the $xy$-plane with the appropriate radius, one can bend the annulus around a cylinder by lifting one edge of the slit, thus creating a ribbon that wraps around the cylinder as shown in figure 5.2. A magnificent architectural example is exhibited by the base of the spiral staircase near the pyramid of the Louvre museum. For the Maple-generated image, a small numerical computation was carried out to figure out the vertical shift and radius of the helicoid so that the staircase and the supporting developable match at the helix of intersection.
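That the tangential developable of the helix is flat can be read off from formula 5.3. The short computation below is a sketch of that check; as in the example, $X = \alpha'$ is left unnormalized, which formula 5.3 permits.

```latex
\begin{align*}
X &= \alpha' = (-\sin u,\ \cos u,\ 1), \qquad X' = \alpha'' = (-\cos u,\ -\sin u,\ 0), \\
(\alpha'\, X\, X') &= (\alpha'\, \alpha'\, \alpha'') = 0
  \quad\text{(two equal entries in the triple product)}, \\
X \times (\alpha' + vX') &= v\, \alpha' \times \alpha'' = v\,(\sin u,\ -\cos u,\ 1),
  \qquad \|X \times (\alpha' + vX')\|^2 = 2v^2, \\
K &= -\frac{(\alpha'\, X\, X')^2}{\|X \times (\alpha' + vX')\|^4} = 0 \quad\text{for } v \neq 0 .
\end{align*}
```

The patch is singular along the helix itself ($v = 0$), where the denominator vanishes; this is the edge of regression of the developable.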


Fig. 5.2: Developable Helicoid 


5.1.2 Surfaces of Constant Positive Curvature 


In this section we prove a few global theorems. We assume the reader is acquainted with the notion of a compact space. In particular, in $\mathbb{R}^n$ a compact set is closed and bounded so it is contained in a ball of sufficiently large radius, centered at the origin. We are concerned with compact manifolds, which by definition are locally Euclidean and have a differentiable structure. Thus, a compact surface in $\mathbb{R}^3$ cannot have any edges or creases, and the tangent space is well defined at all points.


5.1.3 Theorem In any compact surface in $\mathbb{R}^3$ there exists at least one point $p$ at which $K(p) > 0$.




Fig. 5.3: Compact Surface 


Proof Let $M$ be compact. Consider the function $f : M \to \mathbb{R}$ defined by $f = \|\mathbf{x}\|^2$, where $\mathbf{x}$ is the position vector of a point on the surface. This is a continuous function on a compact space, so it attains a maximum at at least one point $p$. The geometric interpretation of $p$ is that it is farthest away from the origin. The intuition about this theorem is simply that near the point $p$, the surface is entirely on one side of the tangent plane as shown in figure 5.3, so the principal curvatures have the same sign. We make this formal. Let $R$ be the distance from $p$ to the origin and construct a sphere of radius $R$ centered at the origin. The sphere will be tangent to the surface at $p$. Given a unit tangent vector $T$, let $\alpha(t)$ be a unit speed curve on the surface near $p$, with $\alpha(0) = p$ and $T = \alpha'(0)$. The composite function $f(\alpha(t))$ also has a maximum at $p$, so by the second derivative test, we have $[f(\alpha)]'(0) = 0$ and $[f(\alpha)]''(0) \le 0$. By the definition of $f$, $f(\alpha) = \|\alpha\|^2 = \langle \alpha, \alpha \rangle$, so $[f(\alpha)]'(0) = 2\langle \alpha, \alpha' \rangle(0) = 0$. We conclude that the position vector $\mathbf{x}_p = \alpha(0)$ of the point $p$ is orthogonal to $T$. Since this is true for any such $T$, the vector $\mathbf{x}_p$ is also normal to the surface, so that the unit normal is $\mathbf{n} = \mathbf{x}_p/R$. Computing the second derivative we get

$$\begin{aligned}
\tfrac{1}{2}[f(\alpha)]''(0) &= \langle \alpha', \alpha' \rangle(0) + \langle \alpha, \alpha'' \rangle(0),\\
&= \langle T, T \rangle + \langle \mathbf{x}_p, \alpha''(0) \rangle,\\
&= 1 + R\langle \mathbf{n}, \alpha''(0) \rangle \le 0.
\end{aligned}$$

But clearly $\langle \mathbf{n}, \alpha''(0) \rangle$ is the normal curvature along $T$, so

$$\kappa_n(p) \le -\frac{1}{R}.$$

Again, since $T$ was arbitrary, the normal curvature is at most $-1/R$ in any direction, a geometric indication that the surface is bending inward at least as much as the sphere, as intuitively shown by the picture. Therefore

$$K(p) = \kappa_1 \kappa_2 \ge \frac{1}{R^2} > 0.$$




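5.1.4 Theorem In a surface in which the coordinate directions are chosen to be the principal directions of curvature, the Codazzi equations are

$$\begin{aligned}
\frac{\partial \kappa_1}{\partial v} &= \frac{1}{2}\frac{E_v}{E}(\kappa_2 - \kappa_1),\\
\frac{\partial \kappa_2}{\partial u} &= \frac{1}{2}\frac{G_u}{G}(\kappa_1 - \kappa_2).
\end{aligned} \tag{5.7}$$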


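Proof Let $X$ and $Y$ be eigenvectors of $L$, so that

$$\begin{aligned}
LX &= \kappa_1 X, \qquad \kappa_1 = \frac{\langle LX, X \rangle}{\langle X, X \rangle} = \frac{e}{E},\\
LY &= \kappa_2 Y, \qquad \kappa_2 = \frac{\langle LY, Y \rangle}{\langle Y, Y \rangle} = \frac{g}{G}.
\end{aligned}$$

If $\kappa_1 \ne \kappa_2$, the eigenvectors are orthogonal, so taking them as the parametric directions means that $F = f = 0$. The Codazzi equations 4.98 are obtained by setting to zero the normal component of $\mathbf{x}_{\alpha\beta\gamma} - \mathbf{x}_{\alpha\gamma\beta} = 0$. In terms of the covariant derivative formulation of the Gauss equation 4.74 with $X = e_1 = \mathbf{x}_u$, $Y = e_2 = \mathbf{x}_v$, $Z = e_k$, the normal component of $(\overline{\nabla}_X \overline{\nabla}_Y - \overline{\nabla}_Y \overline{\nabla}_X)Z = 0$ results in the equations of Codazzi in the form (see 6.22):

$$\begin{aligned}
\langle \nabla_X LY - \nabla_Y LX, Z \rangle &= 0,\\
\nabla_X LY - \nabla_Y LX &= 0.
\end{aligned}$$

We proceed to expand this equation:

$$\begin{aligned}
\nabla_{e_1} L e_2 - \nabla_{e_2} L e_1 &= 0,\\
\nabla_{e_1}(\kappa_2 e_2) - \nabla_{e_2}(\kappa_1 e_1) &= 0,\\
\frac{\partial \kappa_2}{\partial u} e_2 + \kappa_2 \Gamma^\alpha{}_{12}\, e_\alpha - \frac{\partial \kappa_1}{\partial v} e_1 - \kappa_1 \Gamma^\alpha{}_{21}\, e_\alpha &= 0.
\end{aligned}$$

Setting the $e_1$ and $e_2$ components to zero, we get:

$$\begin{aligned}
\frac{\partial \kappa_2}{\partial u} &= \kappa_1 \Gamma^2{}_{21} - \kappa_2 \Gamma^2{}_{12} = (\kappa_1 - \kappa_2)\Gamma^2{}_{12},\\
\frac{\partial \kappa_1}{\partial v} &= \kappa_2 \Gamma^1{}_{12} - \kappa_1 \Gamma^1{}_{21} = (\kappa_2 - \kappa_1)\Gamma^1{}_{21}.
\end{aligned}$$

The result follows immediately from the expressions for the Christoffel symbols 4.83 after setting $F = 0$.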


5.1.5 Proposition (Hilbert) Let $p$ be a non-umbilic point and $\kappa_1(p) > \kappa_2(p)$. If $\kappa_1$ has a local maximum at $p$ and $\kappa_2$ has a local minimum at $p$, then $K(p) \le 0$.
Proof Take the lines of curvature as parametric curves as in the preceding theorem. Suppose $\kappa_1(p) > \kappa_2(p)$ and that the principal curvatures are local extrema. Then $(\kappa_1)_v = (\kappa_2)_u = 0$, so by equation 5.7 we have $E_v = G_u = 0$. Applying the second derivative test by differentiating 5.7 at $p$ we get

$$\begin{aligned}
(\kappa_1)_{vv} &= \frac{1}{2}\frac{E_{vv}}{E}(\kappa_2 - \kappa_1) \le 0,\\
(\kappa_2)_{uu} &= \frac{1}{2}\frac{G_{uu}}{G}(\kappa_1 - \kappa_2) \ge 0,
\end{aligned}$$

which implies that $E_{vv} \ge 0$ and $G_{uu} \ge 0$. On the other hand, noting as above that $E_v = G_u = 0$, the Gaussian curvature formula 4.107 gives

$$K = -\frac{1}{2EG}(E_{vv} + G_{uu}) \le 0.$$


5.1.6 Theorem (Liebmann) A compact manifold $M$ in $\mathbb{R}^3$ of constant Gaussian curvature $K$ is a sphere of radius $R$ with $K = 1/R^2$.


Proof Since $M$ is compact, there is at least one point at which $K > 0$, and since $K$ is constant, $K > 0$ everywhere. We prove by contradiction that all points are umbilic. Suppose there exists a non-umbilic point. Without loss of generality, we assume that the larger principal curvature is $\kappa_1$. The principal curvatures are continuous functions on a compact space, so there is a point $p$ at which $\kappa_1$ is maximum. Since $K = \kappa_1\kappa_2 =$ constant, then at $p$, $\kappa_2$ is a minimum. By Hilbert's proposition above, $K(p) \le 0$, which is a contradiction. So $M$ is a sphere of some radius $R$ and $K = 1/R^2$.


5.1.3 Surfaces of Constant Negative Curvature 


The geometry of surfaces of constant negative curvature is very rich and it has a number of neat applications to physics. If $K < 0$, then it must be the case that the principal curvatures $\kappa_1$ and $\kappa_2$ have different signs. All points on the surface are hyperbolic, and by Hilbert's theorem there are no compact surfaces of constant negative curvature. In addition, since $\kappa_1 \ne \kappa_2$, there always exist orthogonal lines of curvature with principal directions along the eigenvectors of the second fundamental form. The prototype of a surface of constant negative curvature is the pseudosphere introduced in equation 4.24, which we repeat here for convenience.

$$\mathbf{x}(u, v) = \left(a \sin u \cos v,\ a \sin u \sin v,\ a\left(\cos u + \ln\tan\tfrac{u}{2}\right)\right)$$




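To compute the Gaussian curvature we first verify that the first fundamental form is as stated in 4.24. We have:

$$\begin{aligned}
\mathbf{x}_u &= \left(a \cos u \cos v,\ a \cos u \sin v,\ a\,\frac{\cos^2 u}{\sin u}\right),\\
\mathbf{x}_v &= (-a \sin u \sin v,\ a \sin u \cos v,\ 0),\\
E &= a^2 \cos^2 u + a^2\, \frac{\cos^4 u}{\sin^2 u},\\
&= a^2 \cos^2 u \left(1 + \frac{\cos^2 u}{\sin^2 u}\right),\\
&= a^2 \cot^2 u,\\
F &= 0,\\
G &= a^2 \sin^2 u.
\end{aligned}$$

So the parametric curves are orthogonal, and

$$I = a^2 \cot^2 u\, du^2 + a^2 \sin^2 u\, dv^2.$$

Inserting into formula 4.107, we get

$$\begin{aligned}
K &= -\frac{1}{a^2 \cos u}\, \frac{\partial}{\partial u}\left[\frac{\sin u}{a \cos u}\, \frac{\partial}{\partial u}(a \sin u)\right],\\
&= -\frac{1}{a^2 \cos u}\, \frac{\partial}{\partial u}(\sin u),\\
&= -\frac{1}{a^2}.
\end{aligned} \tag{5.8}$$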
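The computation leading to 5.8 can be reproduced numerically. The sketch below assumes that formula 4.107 is the usual orthogonal-coordinates expression $K = -\frac{1}{\sqrt{EG}}\left[\partial_u\!\left(\frac{1}{\sqrt{E}}\partial_u\sqrt{G}\right) + \partial_v\!\left(\frac{1}{\sqrt{G}}\partial_v\sqrt{E}\right)\right]$, which is the form used in the display above, and approximates the derivatives by nested central differences. The function name, step size and sample point are our own choices; the point is taken with $0 < u < \pi/2$ so that $\sqrt{E} = a\cot u$ is positive.

```python
import math

def gauss_curvature_orthogonal(sqrtE, sqrtG, u, v, h=1e-4):
    """Curvature of ds^2 = E du^2 + G dv^2 from callables for sqrt(E), sqrt(G)."""
    def du_term(p, q):   # (1/sqrt E) * d(sqrt G)/du
        return (sqrtG(p + h, q) - sqrtG(p - h, q)) / (2 * h) / sqrtE(p, q)
    def dv_term(p, q):   # (1/sqrt G) * d(sqrt E)/dv
        return (sqrtE(p, q + h) - sqrtE(p, q - h)) / (2 * h) / sqrtG(p, q)
    bracket = ((du_term(u + h, v) - du_term(u - h, v))
               + (dv_term(u, v + h) - dv_term(u, v - h))) / (2 * h)
    return -bracket / (sqrtE(u, v) * sqrtG(u, v))

a = 2.0
sqrtE = lambda u, v: a * math.cos(u) / math.sin(u)   # a cot u
sqrtG = lambda u, v: a * math.sin(u)
K_pseudosphere = gauss_curvature_orthogonal(sqrtE, sqrtG, 1.0, 0.3)
print(K_pseudosphere, -1.0 / a ** 2)
```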


Another common parametrization of the pseudosphere is obtained by the substitution

$$\mu = a \ln\tan\left(\tfrac{u}{2}\right). \tag{5.9}$$

Without real loss of generality, we set $a = 1$, so $e^\mu = \tan(u/2)$. The substitution is somewhat related to the classical Gudermannian. We have:

$$\begin{aligned}
\operatorname{sech}\mu &= \frac{2}{e^\mu + e^{-\mu}}, & \tanh\mu &= \frac{e^\mu - e^{-\mu}}{e^\mu + e^{-\mu}},\\
&= \frac{2}{\tan(u/2) + \cot(u/2)}, & &= \frac{\tan(u/2) - \cot(u/2)}{\tan(u/2) + \cot(u/2)},\\
&= 2\sin(u/2)\cos(u/2), & &= \sin^2(u/2) - \cos^2(u/2),\\
&= \sin u, & &= -\cos u.
\end{aligned}$$

In simplifying the equations above we multiplied top and bottom of the fractions by $\sin(u/2)\cos(u/2)$. In terms of the parameter $\mu$, the coordinate patch for the pseudosphere becomes

$$\mathbf{x}(\mu, v) = (a \operatorname{sech}\mu \cos v,\ a \operatorname{sech}\mu \sin v,\ a(\mu - \tanh\mu)), \tag{5.10}$$




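and, for $a = 1$, the fundamental forms are:

$$\begin{aligned}
I &= \tanh^2\mu\; d\mu^2 + \operatorname{sech}^2\mu\; dv^2,\\
II &= -\operatorname{sech}\mu \tanh\mu\; d\mu^2 + \operatorname{sech}\mu \tanh\mu\; dv^2.
\end{aligned}$$

Using this latter parametrization we compute the surface area and volume of the top half of the pseudosphere.

$$\begin{aligned}
S &= \int\!\!\int \sqrt{EG - F^2}\; d\mu\, dv,\\
&= \int_0^{2\pi}\!\!\int_0^\infty (a \operatorname{sech}\mu)(a \tanh\mu)\; d\mu\, dv,\\
&= 2\pi a^2,
\end{aligned} \tag{5.11}$$

$$V = \pi a^3 \int_0^\infty \operatorname{sech}^2\mu\, \tanh^2\mu\; d\mu, \tag{5.12}$$

$$\phantom{V} = \tfrac{1}{3}\pi a^3. \tag{5.13}$$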
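The two integrals can be confirmed with a simple quadrature. The sketch below uses a hand-written composite Simpson rule and truncates the infinite range at a finite `cutoff`; both the rule and the cutoff value are our assumptions, justified here only by the exponential decay of the integrands. It evaluates the integrals over the top half, $0 \le \mu < \infty$, and compares them with the closed forms $2\pi a^2$ and $\tfrac{1}{3}\pi a^3$.

```python
import math

def simpson(f, lo, hi, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

a = 1.0
cutoff = 40.0                              # stands in for mu -> infinity
sech = lambda m: 1.0 / math.cosh(m)

area = 2 * math.pi * simpson(lambda m: a * sech(m) * a * math.tanh(m), 0.0, cutoff)
volume = math.pi * a ** 3 * simpson(lambda m: sech(m) ** 2 * math.tanh(m) ** 2,
                                    0.0, cutoff)
print(area, 2 * math.pi * a ** 2)
print(volume, math.pi * a ** 3 / 3)
```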


It is interesting to note that, counting both halves, the surface area $4\pi a^2$ is exactly the same as that of a sphere of radius $a$, whereas the volume of revolution $\tfrac{2}{3}\pi a^3$ is half the volume of the sphere. Without loss of understanding of the geometry, for the rest of this section we set $a = 1$, so that $K = -1$. We have the following theorem:


5.1.7 Theorem Let $M$ be a surface with constant negative curvature $K = -1$. If the parametric curves are chosen to be the lines of curvature, there exists some quantity $\omega$ so that the first fundamental form can be written as:

$$I = \cos^2\omega\; du^2 + \sin^2\omega\; dv^2. \tag{5.14}$$

Proof The proof amounts to analyzing the integrability conditions represented by the Codazzi equations. In the brilliant book by Eisenhart [19], the author writes down the Codazzi equations and notes that the choice of $E$ and $G$ above are solutions of the equations. We take a more humble approach and show the steps on how to find a solution using the Cartan formalism. We have seen that using the lines of curvature as parametric curves means that $F = f = 0$, $\kappa_1 = e/E$ and $\kappa_2 = g/G$. Let $E = \alpha^2$ and $G = \beta^2$ so that the first fundamental form is:

$$I = \alpha^2\, du^2 + \beta^2\, dv^2.$$

We choose $\theta^1 = \alpha\, du$ and $\theta^2 = \beta\, dv$ as basis for the cotangent space dual to $\{e_1, e_2\}$. Let $e_3$ be the unit normal to the surface. We have:

$$\begin{aligned}
L e_1 &= \kappa_1 e_1 = \overline{\nabla}_{e_1} e_3 = \omega^1{}_3(e_1)\, e_1,\\
L e_2 &= \kappa_2 e_2 = \overline{\nabla}_{e_2} e_3 = \omega^2{}_3(e_2)\, e_2,
\end{aligned}$$

so,

$$\begin{aligned}
\omega^1{}_3 &= \kappa_1 \theta^1 = \kappa_1 \alpha\, du,\\
\omega^2{}_3 &= \kappa_2 \theta^2 = \kappa_2 \beta\, dv.
\end{aligned}$$




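On the other hand,

$$\begin{aligned}
d\theta^1 &= \alpha_v\, dv \wedge du = -\left(\frac{\alpha_v}{\beta}\, du\right) \wedge \theta^2 = -\omega^1{}_2 \wedge \theta^2,\\
d\theta^2 &= \beta_u\, du \wedge dv = -\left(\frac{\beta_u}{\alpha}\, dv\right) \wedge \theta^1 = -\omega^2{}_1 \wedge \theta^1,
\end{aligned}$$

so that

$$\omega^1{}_2 = \frac{\alpha_v}{\beta}\, du - \frac{\beta_u}{\alpha}\, dv. \tag{5.15}$$

The Codazzi equations read

$$\begin{aligned}
d\omega^1{}_3 &= -\omega^1{}_2 \wedge \omega^2{}_3,\\
d\omega^2{}_3 &= -\omega^2{}_1 \wedge \omega^1{}_3.
\end{aligned}$$

Inserting the connection forms into the first Codazzi equation gives

$$(\kappa_1\alpha)_v\, dv \wedge du + \left[\frac{\alpha_v}{\beta}\, du - \frac{\beta_u}{\alpha}\, dv\right] \wedge \kappa_2 \beta\, dv = 0.$$

Since $\kappa_1\kappa_2 = -1$, we can eliminate $\kappa_2$ and solve for $\alpha$.

$$\begin{aligned}
(\kappa_1\alpha)_v - \alpha_v \kappa_2 &= 0,\\
(\kappa_1 - \kappa_2)\alpha_v + (\kappa_1)_v\, \alpha &= 0,\\
\frac{\alpha_v}{\alpha} &= -\frac{(\kappa_1)_v}{\kappa_1 - \kappa_2},\\
&= -\frac{(\kappa_1)_v}{\kappa_1 + (1/\kappa_1)},\\
&= -\frac{\kappa_1 (\kappa_1)_v}{\kappa_1^2 + 1},\\
\frac{\partial}{\partial v}(\ln\alpha) &= -\frac{\partial}{\partial v}\ln[(\kappa_1^2 + 1)^{1/2}].
\end{aligned}$$

We may set

$$\begin{aligned}
\kappa_1 &= \tan\omega,\\
\kappa_2 &= -\cot\omega,
\end{aligned} \tag{5.16}$$

for some $\omega$. Then $(\kappa_1^2 + 1)^{1/2} = \sec\omega$, so

$$\frac{\partial}{\partial v}(\ln\alpha) = -\frac{\partial}{\partial v}\ln(\sec\omega) = \frac{\partial}{\partial v}\ln(\cos\omega).$$

We choose the simplest solution $\alpha = \cos\omega$. By a completely analogous computation using the second Codazzi equation, we get $\beta = \sin\omega$ and that proves the theorem.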




5.1.8 Theorem If a surface with $K = -1$ has first fundamental form written as $I = \cos^2\omega\; du^2 + \sin^2\omega\; dv^2$, then $\omega$ satisfies the so-called sine-Gordon equation (SGE):

$$\omega_{uu} - \omega_{vv} = \sin\omega \cos\omega. \tag{5.17}$$

Proof Here $E = \cos^2\omega$ and $G = \sin^2\omega$. The theorem follows immediately from inserting these into the Gauss curvature equation in orthogonal coordinates 4.107 and setting $K = -1$:

$$K = -\frac{\omega_{uu} - \omega_{vv}}{\sin\omega \cos\omega} = -1.$$
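As an illustration, the pseudosphere itself provides a solution of 5.17. Comparing its first fundamental form $\tanh^2\mu\; d\mu^2 + \operatorname{sech}^2\mu\; dv^2$ with 5.14 suggests $\cos\omega = \tanh u$ and $\sin\omega = \operatorname{sech} u$, that is, $\omega = 2\arctan(e^{-u})$, independent of $v$. This identification is our own reading of the two formulas rather than a statement from the text; the sketch below checks numerically that it satisfies the SGE, using a central second difference with an assumed step `h`.

```python
import math

def omega(u):
    """Candidate solution with cos(omega) = tanh(u), sin(omega) = sech(u)."""
    return 2.0 * math.atan(math.exp(-u))

def sge_residual(u, h=1e-3):
    """omega_uu - omega_vv - sin(omega) cos(omega), for omega independent of v."""
    w_uu = (omega(u + h) - 2.0 * omega(u) + omega(u - h)) / (h * h)
    w = omega(u)
    return w_uu - math.sin(w) * math.cos(w)

samples = [0.2, 0.7, 1.5, 3.0]
worst_residual = max(abs(sge_residual(u)) for u in samples)
print(worst_residual)
```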


The following transformation is often made:

$$u = \hat{u} + \hat{v}, \qquad v = \hat{u} - \hat{v}.$$

A quick computation yields a transformed fundamental form

$$I = d\hat{u}^2 + 2\cos\hat{\omega}\; d\hat{u}\, d\hat{v} + d\hat{v}^2, \tag{5.18}$$

where $\omega = \hat{\omega}/2$. A coordinate system in which the first fundamental form is of this type is called a Tchebychev patch (eventually one has to make a choice on how to transliterate from the Cyrillic alphabet). The corresponding curvature equation is

$$\hat{\omega}_{\hat{u}\hat{v}} = \sin\hat{\omega}. \tag{5.19}$$

The sine-Gordon equation is one of a class of very special nonlinear partial differential equations which admit soliton solutions. This is an incredibly rich area of research that would take us into a whole new branch of mathematics. We constrain our discussion to certain transformations that allow one to obtain new solutions from known solutions, and associate these with pseudospherical surfaces, that is, surfaces in $\mathbb{R}^3$ with constant negative curvature. We note that if in the sine-Gordon equation 5.17 one sets $v = t$ where $t$ is a time parameter, what we have is a nonlinear wave equation with unit speed. The reader will then recognize the transformation $u = \hat{u} + \hat{v}$, $v = \hat{u} - \hat{v}$ as the equations of characteristics. It is thus not surprising that the equation has solutions of the form $f(u - t)$.


5.1.4 Bäcklund Transforms 


5.1.9 Definition Let $M$ be a surface with $K = -1$ and let $F : M \to \hat{M}$ be a map to another surface $\hat{M}$. Let $\hat{p} = F(p)$ and let $N(p)$ and $\hat{N}(\hat{p})$ be the unit normals at $p$ and $\hat{p}$ respectively. $\hat{M}$ is called a Bäcklund transform (BT) of $M$ with constant angle of inclination $\sigma$, if for all $p$:

a) The angle between $N$ and $\hat{N}$ is $\sigma$,

b) The distance $\lambda$ between $p$ and $\hat{p}$ is $\sin\sigma$,

c) The segment $p\hat{p}$ is tangent to $M$ at $p$.


Bäcklund proved in 1883 that $F$ maps pseudospherical surfaces to pseudospherical surfaces and asymptotic lines to asymptotic lines. He also showed that given any unit tangent vector at $p$ that is not an asymptotic direction, a BT exists with $p\hat{p}$ in the direction of that tangent. The idea behind the proof of the BT theorem is basically to quantify the transformation from an orthonormal frame at $p$ to the orthonormal frame at $\hat{p}$, find the conditions required for $\hat{K} = -1$, and write down the integrability conditions for the Cartan structure equations. The transformation consists of a rotation by an angle $\sigma$, a translation from $p$ to $\hat{p}$ and a rotation by an angle $\theta$ to align the frame with the segment $p\hat{p}$. This could be done all at once, but we prefer to carry out the process in two stages. In the first stage, we apply the translation of the frame assuming that the segment joining $p$ and $\hat{p}$ is parallel to the basis vector $e_1$ at $p$, followed by a rotation by an angle $\sigma$ around the $e_1$ direction. We use this to seek conditions to guarantee that $\hat{K} = -1$. In stage two, we apply a rotation by an angle $\theta$ in the tangent plane.


Fig. 5.4: Bäcklund Transform


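5.1.10 Theorem Let $M$ have Gaussian curvature $K = -1$, and let $\mathbf{x}(u, v)$ be a coordinate patch for $M$ so that $I = E\, du^2 + G\, dv^2$. Let $\{e_1, e_2, e_3 = \mathbf{n}\}$ be an orthonormal frame aligned with the parametric directions. Denote the frame and Cartan forms at $\hat{p}$ with hats. Consider the transformation $\hat{\mathbf{x}} = \mathbf{x} + \lambda e_1$ along with a rotation by an angle $\sigma$ around the $e_1$ axis. Then $\hat{K} = -1$ if and only if $\lambda = \sin\sigma$.

Proof A rotation by an angle $\sigma$ around $e_1$ leaves the tangent vector $e_1$ and its dual form $\theta^1$ fixed. The rotation has a matrix representation as shown below.

$$\begin{bmatrix} \hat{e}_1 \\ \hat{e}_2 \\ \hat{e}_3 \end{bmatrix} =
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\sigma & -\sin\sigma \\ 0 & \sin\sigma & \cos\sigma \end{bmatrix}
\begin{bmatrix} e_1 \\ e_2 \\ e_3 \end{bmatrix}. \tag{5.20}$$

We compute the Cartan frame equations. The coframe forms are $\theta^1 = \sqrt{E}\, du$, $\theta^2 = \sqrt{G}\, dv$ and $\theta^3 = 0$ on $M$. Thus

$$\begin{aligned}
d\mathbf{x} &= \mathbf{x}_u\, du + \mathbf{x}_v\, dv,\\
&= \frac{\mathbf{x}_u}{\sqrt{E}}\sqrt{E}\, du + \frac{\mathbf{x}_v}{\sqrt{G}}\sqrt{G}\, dv,\\
&= \frac{\mathbf{x}_u}{\sqrt{E}}\, \theta^1 + \frac{\mathbf{x}_v}{\sqrt{G}}\, \theta^2,\\
&= e_1\theta^1 + e_2\theta^2.
\end{aligned}$$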




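We now compute $d\hat{\mathbf{x}}$ taking into account the rotation and then the translation:

$$\begin{aligned}
d\hat{\mathbf{x}} &= e_1\hat{\theta}^1 + [\cos\sigma\, e_2 - \sin\sigma\, e_3]\, \hat{\theta}^2,\\
d\hat{\mathbf{x}} &= d\mathbf{x} + \lambda\, de_1,\\
&= e_1\theta^1 + e_2\theta^2 + \lambda(e_2\, \omega^2{}_1 + e_3\, \omega^3{}_1).
\end{aligned}$$

Equating the coefficients of $e_2$ and $e_3$ in the two equations above, we get

$$\begin{aligned}
\cos\sigma\; \hat{\theta}^2 &= \theta^2 + \lambda\, \omega^2{}_1,\\
-\sin\sigma\; \hat{\theta}^2 &= \lambda\, \omega^3{}_1.
\end{aligned} \tag{5.21}$$

Recall from equation 4.111 that $\omega^1{}_3 = l\theta^1 + m\theta^2$ and $\omega^2{}_3 = m\theta^1 + n\theta^2$ yield the symmetric matrix components of the second fundamental form in the given basis. Using this fact, together with $\hat{\theta}^1 = \theta^1$ and $\omega^3{}_1 = -\omega^1{}_3$, and wedging the second equation above with $\theta^1$, we get

$$\begin{aligned}
-\sin\sigma\; \hat{\theta}^1 \wedge \hat{\theta}^2 &= \lambda\, \theta^1 \wedge \omega^3{}_1,\\
&= -\lambda\, \theta^1 \wedge (l\theta^1 + m\theta^2),\\
\hat{\theta}^1 \wedge \hat{\theta}^2 &= \frac{\lambda m}{\sin\sigma}\, \theta^1 \wedge \theta^2.
\end{aligned} \tag{5.22}$$

Multiplying the first equation in 5.21 by $\sin\sigma$, the second by $\cos\sigma$, and adding, we get

$$\sin\sigma[\theta^2 + \lambda\, \omega^2{}_1] = -\lambda \cos\sigma\; \omega^3{}_1, \tag{5.23}$$

$$\theta^2 = -\frac{\lambda}{\sin\sigma}[\sin\sigma\; \omega^2{}_1 + \cos\sigma\; \omega^3{}_1]. \tag{5.24}$$

Next, we compute $\hat{\omega}^3{}_2$:

$$\begin{aligned}
\hat{\omega}^3{}_2(X) &= \langle \overline{\nabla}_X \hat{e}_2, \hat{e}_3 \rangle,\\
&= \langle \cos\sigma\, \overline{\nabla}_X e_2 - \sin\sigma\, \overline{\nabla}_X e_3,\ \sin\sigma\, e_2 + \cos\sigma\, e_3 \rangle,\\
&= \cos^2\sigma \langle \overline{\nabla}_X e_2, e_3 \rangle - \sin^2\sigma \langle \overline{\nabla}_X e_3, e_2 \rangle,\\
&= (\cos^2\sigma + \sin^2\sigma) \langle \overline{\nabla}_X e_2, e_3 \rangle,\\
&= \omega^3{}_2(X),\\
\hat{\omega}^3{}_2 &= \omega^3{}_2 = -m\theta^1 - n\theta^2.
\end{aligned}$$

By the same process, we calculate $\hat{\omega}^3{}_1$:

$$\begin{aligned}
\hat{\omega}^3{}_1(X) &= \langle \overline{\nabla}_X \hat{e}_1, \hat{e}_3 \rangle,\\
&= \langle \overline{\nabla}_X e_1,\ \sin\sigma\, e_2 + \cos\sigma\, e_3 \rangle,\\
&= \langle \omega^2{}_1(X) e_2 + \omega^3{}_1(X) e_3,\ \sin\sigma\, e_2 + \cos\sigma\, e_3 \rangle,\\
&= \sin\sigma\; \omega^2{}_1(X) + \cos\sigma\; \omega^3{}_1(X).
\end{aligned}$$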




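Finally, putting these results together, equation 5.24 gives $\hat{\omega}^3{}_1 = -(\sin\sigma/\lambda)\, \theta^2$, so using 5.22 we get

$$\hat{K}\, \hat{\theta}^1 \wedge \hat{\theta}^2 = \hat{\omega}^1{}_3 \wedge \hat{\omega}^2{}_3 = \frac{\sin\sigma}{\lambda}\, \theta^2 \wedge (m\theta^1 + n\theta^2) = -\frac{m \sin\sigma}{\lambda}\, \theta^1 \wedge \theta^2 = -\left(\frac{\sin\sigma}{\lambda}\right)^2 \hat{\theta}^1 \wedge \hat{\theta}^2.$$

Hence

$$\hat{K} = -1$$

if and only if $\lambda = \pm\sin\sigma$. We choose $\lambda = \sin\sigma$.

The conclusion of the theorem explains the condition in the definition of a BT that requires this equation to hold. With this condition, equation 5.23 takes the form:

$$\omega^1{}_2 = \cot\sigma\; \omega^3{}_1 + \csc\sigma\; \theta^2. \tag{5.25}$$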
We move to stage two of the BT process. 
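5.1.11 Theorem Let $M$ be a pseudospherical surface with first fundamental form as in 5.14, and let $F : M \to \hat{M}$ be a BT with angle of inclination $\sigma$. If the segment $p\hat{p}$ makes an angle $\theta$ with the basis vector $e_1$ at each point $p \in M$, then

$$\begin{aligned}
\sin\sigma\, (\theta_u + \omega_v) &= \sin\theta \cos\omega - \cos\sigma \cos\theta \sin\omega,\\
\sin\sigma\, (\theta_v + \omega_u) &= -\cos\theta \sin\omega + \cos\sigma \sin\theta \cos\omega.
\end{aligned} \tag{5.26}$$

Proof Suppose $p\hat{p}$ makes an angle $\theta$ with the tangent vector $e_1$. In this case we first perform a rotation of axes around the normal vector $e_3$ to align the frame with the segment $p\hat{p}$. The rotation can be represented by a matrix $\tilde{e}_j = e_i B^i{}_j$,

$$B = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}. \tag{5.27}$$

The effect on the Cartan frame is much easier to establish since all we have to do is apply the change of basis formula 3.48 as shown in 3.51.

$$\begin{aligned}
\tilde{\omega}^1{}_2 &= \omega^1{}_2 - d\theta,\\
\tilde{\omega}^1{}_3 &= \cos\theta\; \omega^1{}_3 + \sin\theta\; \omega^2{}_3,\\
\tilde{\omega}^2{}_3 &= -\sin\theta\; \omega^1{}_3 + \cos\theta\; \omega^2{}_3.
\end{aligned} \tag{5.28}$$

The dual forms transform with $A = B^{-1} = B^T$, that is, $\tilde{\theta}^i = A^i{}_j \theta^j$. In particular

$$\tilde{\theta}^2 = -\sin\theta\; \theta^1 + \cos\theta\; \theta^2. \tag{5.29}$$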




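If we start with a pseudospherical surface with $I = \cos^2\omega\; du^2 + \sin^2\omega\; dv^2$, the Cartan forms are:

$$\begin{aligned}
\theta^1 &= \cos\omega\; du, & \theta^2 &= \sin\omega\; dv,\\
\omega^1{}_3 &= \sin\omega\; du, & \omega^2{}_3 &= -\cos\omega\; dv,\\
\omega^1{}_2 &= -\omega_v\, du - \omega_u\, dv.
\end{aligned}$$

The BT is the composition of stage one and stage two. This means that we must subject the Cartan forms to the change of basis by a rotation by $\theta$, as described by equations 5.28 and 5.29, followed by substitution into equation 5.25. Equation 5.25 is stated in terms of $\omega^3{}_1 = -\omega^1{}_3$; replacing $\sigma$ by $\pi - \sigma$, which leaves $\lambda = \sin\sigma$ unchanged, it reads $\omega^1{}_2 = \cot\sigma\; \omega^1{}_3 + \csc\sigma\; \theta^2$. We get:

$$(\omega^1{}_2 - d\theta) = \cot\sigma\, (\cos\theta\; \omega^1{}_3 + \sin\theta\; \omega^2{}_3) + \csc\sigma\, (-\sin\theta\; \theta^1 + \cos\theta\; \theta^2).$$

We extract two formulas obtained by equating the coefficients of $du$ and $dv$ respectively.

$$\begin{aligned}
-\omega_v - \theta_u &= \cot\sigma \cos\theta \sin\omega - \csc\sigma \sin\theta \cos\omega,\\
-\omega_u - \theta_v &= -\cot\sigma \sin\theta \cos\omega + \csc\sigma \cos\theta \sin\omega.
\end{aligned}$$

The theorem follows by multiplying these equations by $\sin\sigma$ and rearranging terms. The system of equations 5.26 is the classical Bäcklund transform. In the special case in which $\sigma = \pi/2$, the angle between the normals $e_3$ and $\hat{e}_3$ is a right angle, so $\hat{e}_3$ is parallel to a tangent vector of $M$. This is called a Bianchi transform. Equations 5.26 then reduce to the much simpler system:

$$\begin{aligned}
\theta_u + \omega_v &= \sin\theta \cos\omega,\\
\theta_v + \omega_u &= -\cos\theta \sin\omega.
\end{aligned} \tag{5.30}$$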


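We can rewrite the BT-equations in the so-called asymptotic coordinates. Let

$$\begin{aligned} u &= x + t,\\ v &= x - t, \end{aligned} \qquad\text{so that}\qquad \begin{aligned} x &= \tfrac{1}{2}(u + v),\\ t &= \tfrac{1}{2}(u - v). \end{aligned}$$

By the chain rule, we have:

$$\begin{aligned}
\theta_u &= \tfrac{1}{2}(\theta_x + \theta_t), & \omega_u &= \tfrac{1}{2}(\omega_x + \omega_t),\\
\theta_v &= \tfrac{1}{2}(\theta_x - \theta_t), & \omega_v &= \tfrac{1}{2}(\omega_x - \omega_t).
\end{aligned}$$

Adding and subtracting equations 5.26, the system reduces to

$$\begin{aligned}
\theta_x + \omega_x &= \frac{1 + \cos\sigma}{\sin\sigma}\, \sin(\theta - \omega),\\
\theta_t - \omega_t &= \frac{1 - \cos\sigma}{\sin\sigma}\, \sin(\theta + \omega),
\end{aligned}$$

which we can rewrite as

$$\begin{aligned}
\theta_t &= \omega_t + s \sin(\theta + \omega),\\
\theta_x &= -\omega_x + \frac{1}{s} \sin(\theta - \omega),
\end{aligned} \tag{5.31}$$

where $s = \tan(\sigma/2)$. We denote the system of BT equations 5.31 by the notation $F = F(\omega, \theta, s)$.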

Given a pseudospherical surface $S$ associated with a solution $\omega$ of the sine-Gordon equation, the transform $F = F(\omega, \theta, s)$ produces a new solution $\theta$ associated with a new pseudospherical surface $S'$. Of course, the process can be iterated to produce new surfaces and new solutions. The neat thing is that further iterations can be carried out algebraically without the need to solve more differential equations. This remarkable result is encapsulated in the following theorem (see [19]).


5.1.12 Theorem (Bianchi permutability) Let $\{S, \omega\}$ be the pair consisting of a pseudospherical surface corresponding to a SGE solution $\omega$. Suppose that $\sin^2\theta_1 \ne \sin^2\theta_2$, and that $\{S_1, \theta_1\}$ and $\{S_2, \theta_2\}$ are pseudospherical surfaces generated respectively from surface $S$ by BT's

$$S \xrightarrow{\;F(\omega, \theta_1, s_1)\;} S_1, \qquad S \xrightarrow{\;F(\omega, \theta_2, s_2)\;} S_2.$$

Then, the pair $\{S', \Omega\}$ consisting of a pseudospherical surface $S'$ with SGE solution $\Omega$ can be found algebraically by requiring the compatibility of BT's

$$S_1 \xrightarrow{\;F(\theta_1, \Omega, s_2)\;} S', \qquad S_2 \xrightarrow{\;F(\theta_2, \Omega, s_1)\;} S'.$$
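Proof It suffices to use only one of each pair of BT's. By assumption,

$$\begin{aligned}
(\theta_1)_t &= \omega_t + s_1 \sin(\theta_1 + \omega), & \Omega_t &= (\theta_1)_t + s_2 \sin(\Omega + \theta_1),\\
(\theta_2)_t &= \omega_t + s_2 \sin(\theta_2 + \omega), & \Omega_t &= (\theta_2)_t + s_1 \sin(\Omega + \theta_2).
\end{aligned}$$

Adding the difference of the two equations on the left with the difference of the two equations on the right, we see that all derivatives cancel, and we get:

$$s_1[\sin(\theta_1 + \omega) - \sin(\Omega + \theta_2)] + s_2[\sin(\Omega + \theta_1) - \sin(\theta_2 + \omega)] = 0.$$

If we have quantities $A$ and $B$ such that $A s_1 + B s_2 = 0$, then

$$\begin{aligned}
\frac{s_1}{s_2} &= -\frac{B}{A},\\
1 + \frac{s_1}{s_2} &= \frac{s_2 + s_1}{s_2} = \frac{A - B}{A},\\
1 - \frac{s_1}{s_2} &= \frac{s_2 - s_1}{s_2} = \frac{A + B}{A}.
\end{aligned}$$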



Applying this to the equation above, we have 


so+s1 _ [sin(@1 +w) — sin(Q + 62)] — [sin(Q + 61) — sin (62 + w)] 
s2— 8  [sin(@; +w) — sin(Q + 62 [sin(Q + 61) — sin(@2 +w)]’ 
_ [sin(@1 +w) — sin(Q + 61)] + [sin(92 + w) — sin(Q + 62)] 
[sin(01 +w) + sin(Q + 64 [sin(@2 + w) + sin(Q + 62)]’ 


Using the sum-product formulas for sine functions we rewrite the equation as, 


sots,  2sin[$(w —Q)] cos[Z(Q + w + 201)] + 2sin[Z (w — Q)] cos[Z(Q + w + 262)] 
82-81  2cos[4(w — Q)] sin[$(Q +w + 20; 2.cos[4(w — Q)] sin[Z(Q + w + 262)]’ 
2sin[4(Q — w)]{cos[4(Q + w + 201)] + cos[4 (Q + w + 202)]} 
2cos[4(w — Q)]{sin[4(Q + w + 201)] — sin[4 (Q + w + 202))}? 


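Now, using the sum-product formulas again, we get,

$$\frac{s_2+s_1}{s_2-s_1} = \frac{4\sin[\tfrac12(\omega-\Omega)]\cos[\tfrac14(2\theta_1-2\theta_2)]\cos[\tfrac14(2\Omega+2\omega+2\theta_1+2\theta_2)]}{4\cos[\tfrac12(\omega-\Omega)]\sin[\tfrac14(2\theta_1-2\theta_2)]\cos[\tfrac14(2\Omega+2\omega+2\theta_1+2\theta_2)]}$$
$$\phantom{\frac{s_2+s_1}{s_2-s_1}} = \frac{4\sin[\tfrac12(\Omega-\omega)]\cos[\tfrac12(\theta_1-\theta_2)]}{4\cos[\tfrac12(\omega-\Omega)]\sin[\tfrac12(\theta_2-\theta_1)]}.$$

We conclude that

$$\tan\left(\frac{\Omega-\omega}{2}\right) = \frac{s_2+s_1}{s_2-s_1}\,\tan\left(\frac{\theta_2-\theta_1}{2}\right). \tag{5.32}$$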
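
The algebra leading to equation 5.32 can be spot-checked numerically. The following is a minimal sketch in Python, assuming only the definitions of $A$ and $B$ used in the proof; the function name `permutability_sides` and the sample angles are illustrative choices, not part of the text.

```python
import math

def permutability_sides(omega, Omega, th1, th2):
    """Return (lhs, rhs), where lhs = (A - B)/(A + B) plays the role of
    (s2 + s1)/(s2 - s1) and rhs is tan((Omega-omega)/2) / tan((th2-th1)/2)."""
    A = math.sin(th1 + omega) - math.sin(Omega + th2)
    B = math.sin(Omega + th1) - math.sin(th2 + omega)
    lhs = (A - B) / (A + B)
    rhs = math.tan((Omega - omega) / 2) / math.tan((th2 - th1) / 2)
    return lhs, rhs

# Arbitrary test angles (in radians); any generic values should agree.
lhs, rhs = permutability_sides(0.3, 1.1, 0.7, -0.4)
print(lhs, rhs)
```

If the two printed numbers agree to rounding error, the chain of sum-product manipulations is consistent for those angles.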


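It is easy to write coordinate patch equations for a BT. The vector $\bar{\mathbf{x}} - \mathbf{x}$ must
be a vector of length $\sin\sigma$ which is tangent to $M$ at each point $p$ and makes
an angle $\theta$ with $\mathbf{e}_1$. Therefore, we must have

$$\bar{\mathbf{x}} - \mathbf{x} = \sin\sigma\,[\cos\theta\, \mathbf{e}_1 + \sin\theta\, \mathbf{e}_2].$$

But, $\mathbf{e}_1 = \mathbf{x}_u/\sqrt{E}$ and $\mathbf{e}_2 = \mathbf{x}_v/\sqrt{G}$, so we have:

$$\bar{\mathbf{x}} = \mathbf{x} + \sin\sigma\left[\frac{\cos\theta}{\cos\omega}\,\mathbf{x}_u + \frac{\sin\theta}{\sin\omega}\,\mathbf{x}_v\right]. \tag{5.33}$$
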


5.1.13 Example Pseudosphere and one-soliton solution 

If we are willing to sacrifice a bit of rigor for the sake of intuition, we can 
motivate the derivation of the standard parametric representation of the pseu- 
dosphere directly from the Bianchi transform. Recall that in the steps leading 
to the SGE, we chose the principal curvatures (see equation 5.16) such that
$\kappa_1 = \tan\omega$ and $\kappa_2 = -\cot\omega$. Then, as $\omega$ approaches zero, $\kappa_1$ also approaches
zero, while $\kappa_2$ becomes arbitrarily large in magnitude, so as to maintain $K = -1$. The result
is a degenerate surface that collapses onto a straight line. We may think of 
it as an infinitely long trumpet of infinitesimal diameter. As such, we pick a 
degenerate patch of the form 


x(u, v) = (0,0, u) 




We set $\mathbf{e}_1 = (0,0,1)$ and, anticipating a surface of revolution, we pick $\mathbf{e}_2 =
(\cos\phi, \sin\phi, 0)$. Recalling the BT coordinate patch in equation 5.33

$$\bar{\mathbf{x}} = \mathbf{x} + \sin\sigma\,[\cos\theta\, \mathbf{e}_1 + \sin\theta\, \mathbf{e}_2], \tag{5.34}$$

we consider the case with $\sin\sigma = 1$. With $\omega$ arbitrarily close to zero, the Bianchi
transform equations 5.30 become,

$$\theta_u = \sin\theta, \qquad \theta_v = 0.$$

We set $v = \phi$ and without loss of generality, we pick the constant of integration
in the first equation above to be 0. The elementary integral gives immediately
the stationary one-soliton solution

$$u = \ln\left(\tan\tfrac{\theta}{2}\right), \quad\text{or,}\quad \theta = 2\tan^{-1}(e^{u}).$$

The coordinate patch for the corresponding surface then gives,

$$\bar{\mathbf{x}} = \left(0, 0, \ln(\tan\tfrac{\theta}{2})\right) + \cos\theta\,(0,0,1) + \sin\theta\,(\cos\phi, \sin\phi, 0)$$
$$\phantom{\bar{\mathbf{x}}} = \left(\sin\theta\cos\phi,\; \sin\theta\sin\phi,\; \cos\theta + \ln(\tan\tfrac{\theta}{2})\right),$$

which agrees with the parametrization of the pseudosphere given in equation
4.24. In other words, the angle $\theta$ is precisely as shown in figure 4.6, consistent
with the geometry of the Bianchi transform stating that the segment joining
the corresponding points is tangential to the surface generated.
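
As a quick check of this example, here is a minimal Python sketch, assuming the one-soliton formula $\theta = 2\tan^{-1}(e^u)$ and the patch just derived. The helper names `theta`, `soliton_residual` and `pseudosphere` are hypothetical and chosen only for illustration. The first function tests $\theta_u = \sin\theta$ by a central difference; the last one evaluates the patch, whose radius should equal $\operatorname{sech} u$ and whose height should equal $u - \tanh u$.

```python
import math

def theta(u):
    """Stationary one-soliton solution theta = 2 arctan(e^u)."""
    return 2.0 * math.atan(math.exp(u))

def soliton_residual(u, h=1e-6):
    """Central-difference estimate of theta_u - sin(theta); should be near 0."""
    dtheta = (theta(u + h) - theta(u - h)) / (2.0 * h)
    return dtheta - math.sin(theta(u))

def pseudosphere(u, phi):
    """Patch (sin t cos phi, sin t sin phi, cos t + ln tan(t/2)) with t = theta(u)."""
    t = theta(u)
    return (math.sin(t) * math.cos(phi),
            math.sin(t) * math.sin(phi),
            math.cos(t) + math.log(math.tan(t / 2.0)))

print(soliton_residual(0.5))
print(pseudosphere(0.5, 0.0))
```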


5.1.14 Example Dini’s surface 

Another well-known surface that can be obtained by BT results from “twisting”
the pseudosphere in a helicoidal manner. Dini’s surface is obtained by removing
the special condition $\sin\sigma = 1$. Again, if one begins with the trivial solution
$\omega = 0$ of the Sine-Gordon equation 5.19, the BT equations 5.31 reduce to,

$$\theta_x = s\sin\theta, \qquad \theta_t = \frac{1}{s}\sin\theta.$$

Integrating, we get

$$\ln\left(\tan\tfrac{\theta}{2}\right) = s x + \frac{1}{s}\,t, \tag{5.35}$$

where we have set the constant of integration to 0. Solving for $\theta$, we are led
to the moving one-soliton solution

$$\theta(x,t) = 2\tan^{-1}\left(e^{s x + \frac{1}{s} t}\right).$$

We carry out the parametrization using the BT equations in 5.26. Again,
assuming $\omega$ is arbitrarily close to zero, the equations reduce to

$$(\sin\sigma)\,\theta_u = \sin\theta, \qquad (\sin\sigma)\,\theta_v = \cos\sigma\,\sin\theta,$$

or,

$$(\csc\theta)\,\theta_u = \frac{1}{\sin\sigma}, \qquad (\csc\theta)\,\theta_v = \frac{\cos\sigma}{\sin\sigma}.$$



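Rewriting the equations in differential form

$$d\left[\ln\tan\left(\tfrac{\theta}{2}\right)\right] = \frac{1}{\sin\sigma}\,du + \frac{\cos\sigma}{\sin\sigma}\,dv,$$

we can integrate immediately,

$$\ln\tan\left(\tfrac{\theta}{2}\right) = \frac{u + v\cos\sigma}{\sin\sigma} = \chi. \tag{5.36}$$

Here we have used $\chi$ to denote the expression on the right hand side and we set
the integration constant to 0. In terms of these coordinates, the moving soliton
solution is

$$\theta(u,v) = 2\tan^{-1}(e^{\chi}).$$

Using the same degenerate patch 5.34, with $\phi = v$, the parametrization of the surface becomes

$$\bar{\mathbf{x}} = (0,0,u) + \sin\sigma\,[\cos\theta\,(0,0,1) + \sin\theta\,(\cos\phi, \sin\phi, 0)]$$
$$\phantom{\bar{\mathbf{x}}} = (\sin\sigma\sin\theta\cos\phi,\; \sin\sigma\sin\theta\sin\phi,\; u + \sin\sigma\cos\theta).$$

Finally, since $\sin\theta = \operatorname{sech}\chi$ and $\cos\theta = -\tanh\chi$ (compare the results in equations 5.10), we rewrite the parametrization of
Dini’s surface as

$$\bar{\mathbf{x}} = (\sin\sigma\cos\phi\,\operatorname{sech}\chi,\; \sin\sigma\sin\phi\,\operatorname{sech}\chi,\; u - \sin\sigma\tanh\chi). \tag{5.37}$$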


Notice that as expected, when $\sigma = \pi/2$, that is, when $\sin\sigma = 1$, we get $\chi = u$,
and the equation reduces to a pseudosphere (see equation 5.10). Another common
parametrization of Dini’s surface in which the geometry is more intuitive
is given by:

$$\mathbf{x}(u,v) = \left(a\cos u\cos v,\; a\cos u\sin v,\; a\left(\cos u + \ln\tan(\tfrac{u}{2})\right) + bv\right). \tag{5.38}$$

This surface has curvature $K = -1$ when $a^2 + b^2 = 1$, and it has an unfolding
infundibular shape, as shown in figure 5.5, with parameters $u \in [0,2]$, $v \in [0,4\pi]$
and $a = 1$, $b = 0.5, 0.2$. The surface is essentially generated by revolving the
tractrix profile curve of the pseudosphere about the central axis, while at the
same time translating the curve at a constant rate parallel to the axis. The
meridians traced by the parametric curves $u =$ constant are helices. When
$b = 0$, the equation gives a pseudosphere.


5.1.15 Example Kuen surface 
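Applying the permutability theorem 5.32 to solution 5.1.4 we obtain immediately
the two-soliton solution

$$\Omega = 2\tan^{-1}\left[\frac{s_2+s_1}{s_2-s_1}\;\frac{e^{s_2 x + \frac{1}{s_2}t} - e^{s_1 x + \frac{1}{s_1}t}}{1 + e^{(s_1+s_2)x + \left(\frac{1}{s_1}+\frac{1}{s_2}\right)t}}\right]. \tag{5.39}$$
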


In this example, we perform a Bianchi transformation of a one-soliton pseudosphere
to obtain a Kuen surface which is associated with a two-soliton solution.
We begin with the parametrization of a pseudosphere given by 5.10 with $a = 1$,

$$\mathbf{x}(u,v) = (\operatorname{sech} u\cos v,\; \operatorname{sech} u\sin v,\; u - \tanh u).$$




a) Dini, b=0.5 b) Dini, b=0.2 c) Calla Lily 


Fig. 5.5: Dini’s Surface 


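Let $\omega = 2\tan^{-1}(e^{u})$, so that $u = \ln\tan(\omega/2)$. Then,

$$\sin\omega = \operatorname{sech} u, \qquad \cos\omega = -\tanh u.$$

We will find $\theta$ by solving the Bianchi equations 5.30. We compute:

$$\omega_u = \frac{2e^{u}}{1+e^{2u}} = \frac{2}{e^{-u}+e^{u}} = \operatorname{sech} u, \qquad \omega_v = 0.$$

Substituting into the Bianchi equations, we get:

$$\theta_u = -\sin\theta\tanh u,$$
$$\theta_v = -\cos\theta\operatorname{sech} u - \operatorname{sech} u = -\operatorname{sech} u\,(1+\cos\theta) = -2\cos^2(\tfrac{\theta}{2})\operatorname{sech} u.$$

Separate variables

$$\csc\theta\;\theta_u = -\tanh u, \qquad \tfrac12\sec^2(\tfrac{\theta}{2})\;\theta_v = -\operatorname{sech} u,$$

and integrate. The result is:

$$\tan(\tfrac{\theta}{2}) = -h_1(v)\operatorname{sech} u, \qquad \tan(\tfrac{\theta}{2}) = -v\operatorname{sech} u + h_2(u),$$

where $h_1$ and $h_2$ are the arbitrary functions of integration. Consistency of the
equations requires $h_1(v) = v$ and $h_2 = 0$. The solution is therefore

$$\tan(\tfrac{\theta}{2}) = -v\operatorname{sech} u = \frac{-v}{\cosh u}, \qquad \theta = 2\tan^{-1}(-v\operatorname{sech} u).$$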




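Only the cosine and the sine of the angle $\theta$ enter into the Bianchi coordinate
patch. Thinking of $\tan(\tfrac{\theta}{2})$ as the ratio of the opposite over the adjacent side
of the right triangle with hypotenuse $\sqrt{\cosh^2 u + v^2}$, we can compute the sine
and cosine from the double angle formulas:

$$\cos\theta = \cos^2(\tfrac{\theta}{2}) - \sin^2(\tfrac{\theta}{2}) = \frac{\cosh^2 u - v^2}{\cosh^2 u + v^2},$$
$$\sin\theta = 2\sin(\tfrac{\theta}{2})\cos(\tfrac{\theta}{2}) = \frac{-2v\cosh u}{\cosh^2 u + v^2}.$$

It remains to go through the algebraic gymnastics of computing the coordinate
patch:

$$\bar{\mathbf{x}} = \mathbf{x} + \frac{\cos\theta}{\cos\omega}\,\mathbf{x}_u + \frac{\sin\theta}{\sin\omega}\,\mathbf{x}_v$$
$$\phantom{\bar{\mathbf{x}}} = (\operatorname{sech} u\cos v,\; \operatorname{sech} u\sin v,\; u - \tanh u)$$
$$\qquad + \frac{\cos\theta}{-\tanh u}\,(-\operatorname{sech} u\tanh u\cos v,\; -\operatorname{sech} u\tanh u\sin v,\; \tanh^2 u)$$
$$\qquad + \frac{\sin\theta}{\operatorname{sech} u}\,(-\operatorname{sech} u\sin v,\; \operatorname{sech} u\cos v,\; 0)$$
$$\phantom{\bar{\mathbf{x}}} = (\operatorname{sech} u\cos v,\; \operatorname{sech} u\sin v,\; u - \tanh u)$$
$$\qquad + \cos\theta\,(\operatorname{sech} u\cos v,\; \operatorname{sech} u\sin v,\; -\tanh u) + \sin\theta\,(-\sin v,\; \cos v,\; 0).$$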


a) Kuen b) Breather c) Breather d) Ochuva 


Fig. 5.6: Surfaces with K= -1 


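The $x$-component of $\bar{\mathbf{x}}$ is:

$$\bar{x}^1 = (1+\cos\theta)\operatorname{sech} u\cos v - \sin\theta\sin v$$
$$\phantom{\bar{x}^1} = \left(1 + \frac{\cosh^2 u - v^2}{\cosh^2 u + v^2}\right)\operatorname{sech} u\cos v + \left(\frac{2v\cosh u}{\cosh^2 u + v^2}\right)\sin v$$
$$\phantom{\bar{x}^1} = \frac{2\cosh^2 u}{\cosh^2 u + v^2}\operatorname{sech} u\cos v + \frac{2v\cosh u}{\cosh^2 u + v^2}\sin v$$
$$\phantom{\bar{x}^1} = \frac{2\cosh u\,(\cos v + v\sin v)}{\cosh^2 u + v^2}.$$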




The computation of the other two components is left as an exercise. The result
is

$$\bar{\mathbf{x}}(u,v) = \left(\frac{2\cosh u\,(\cos v + v\sin v)}{\cosh^2 u + v^2},\; \frac{2\cosh u\,(\sin v - v\cos v)}{\cosh^2 u + v^2},\; u - \frac{\sinh 2u}{\cosh^2 u + v^2}\right). \tag{5.40}$$

The Kuen surface in figure 5.6a is plotted with parameters $u \in [-1.4, 1.4]$ and
$v \in [-4, 4]$.
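
The exercise can be cross-checked numerically before attempting it by hand. The sketch below, in Python, compares the Bianchi construction $\bar{\mathbf{x}} = \mathbf{x} + \cos\theta\,(\operatorname{sech} u\cos v, \operatorname{sech} u\sin v, -\tanh u) + \sin\theta\,(-\sin v, \cos v, 0)$ against the closed form 5.40. The function names are hypothetical, and the sample point is arbitrary.

```python
import math

def kuen_via_bianchi(u, v):
    """Build the transformed patch from the pseudosphere and the angle theta."""
    d = math.cosh(u) ** 2 + v ** 2
    cos_t = (math.cosh(u) ** 2 - v ** 2) / d
    sin_t = -2.0 * v * math.cosh(u) / d
    sech, tanh = 1.0 / math.cosh(u), math.tanh(u)
    x = (sech * math.cos(v), sech * math.sin(v), u - tanh)   # pseudosphere
    a = (sech * math.cos(v), sech * math.sin(v), -tanh)      # x_u / cos(omega)
    b = (-math.sin(v), math.cos(v), 0.0)                     # x_v / sin(omega)
    return tuple(x[i] + cos_t * a[i] + sin_t * b[i] for i in range(3))

def kuen_closed_form(u, v):
    """Closed-form Kuen patch as in equation 5.40."""
    d = math.cosh(u) ** 2 + v ** 2
    return (2.0 * math.cosh(u) * (math.cos(v) + v * math.sin(v)) / d,
            2.0 * math.cosh(u) * (math.sin(v) - v * math.cos(v)) / d,
            u - math.sinh(2.0 * u) / d)

print(kuen_via_bianchi(0.7, 1.3))
print(kuen_closed_form(0.7, 1.3))
```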
As noted by Terng and Uhlenbeck [38], if in the 2-soliton equation 5.39
one sets $s_1 = e^{i\theta}$ and $s_2 = -e^{-i\theta}$, we get a real-valued solution

$$\phi = 2\tan^{-1}\left[\frac{\sin\theta\,\sin(\eta\cos\theta)}{\cos\theta\,\cosh(\xi\sin\theta)}\right], \tag{5.41}$$

where $\xi = x - t$ and $\eta = x + t$. This is a periodic solution called a breather.
A rendition of the surface associated with this solution is shown in figure 5.6,
using the parametrization derived by Rogers and Schief [31].


5.2 Minimal Surfaces 


5.2.1 Minimal Area Property 


In an earlier chapter we defined a minimal surface to be a surface of mean
curvature $H = 0$. From the formula for the mean curvature 4.65, a surface in
$\mathbf{R}^3$ is minimal if

$$2H = \mathrm{Tr}(g^{-1}b) = 0, \quad\text{which implies}\quad Eg - 2Ff + Ge = 0.$$


For historical reasons we first consider this condition for a surface with equation
$z = f(x,y)$. We rewrite in parametric form using a Monge coordinate patch

$$\mathbf{x}(x,y) = (x,\; y,\; f(x,y)).$$

A quick computation yields the coefficients of the first and second fundamental
forms.

$$E = 1 + f_x^2, \qquad e = f_{xx}/D,$$
$$G = 1 + f_y^2, \quad\text{and}\quad g = f_{yy}/D,$$
$$F = f_x f_y, \qquad\;\; f = f_{xy}/D,$$

where $D = \sqrt{EG - F^2} = \sqrt{1 + f_x^2 + f_y^2}$. The surface has Gaussian curvature
$K = 0$ if $f(x,y)$ satisfies the Monge-Ampère equation

$$f_{xx}f_{yy} - f_{xy}^2 = 0.$$

We have already determined that the solutions are developable surfaces. The
condition $H = 0$ for having a minimal surface is that $f(x,y)$ satisfies the quasilinear
differential equation:

$$(1 + f_y^2)\,f_{xx} - 2f_x f_y f_{xy} + (1 + f_x^2)\,f_{yy} = 0. \tag{5.42}$$




Using the notation $p = f_x$, $q = f_y$, the condition that the surface area over a
region be minimal follows from the variational equation:

$$\delta\iint \sqrt{EG - F^2}\;dy\,dx = \delta\iint \sqrt{1 + p^2 + q^2}\;dy\,dx = 0.$$

The Euler-Lagrange equation for this functional is

$$\nabla\cdot\left[\frac{\nabla f}{\sqrt{1 + |\nabla f|^2}}\right] = \frac{\partial}{\partial x}\left[\frac{p}{\sqrt{1+p^2+q^2}}\right] + \frac{\partial}{\partial y}\left[\frac{q}{\sqrt{1+p^2+q^2}}\right] = 0. \tag{5.43}$$

It was proved by Lagrange in 1762 that this equation is equivalent to the condition
$H = 0$ exemplified in equation 5.42, but he was unable to find non-trivial
solutions. In 1776 Meusnier showed that the catenoid and the helicoid satisfy the
Euler-Lagrange equation 5.43 and thus have zero mean curvature.


5.2.1 Example Scherk’s surface 
The catenoid and the helicoid remained the only known minimal surfaces until
1830, when Scherk obtained a new solution under the assumption that $f(x,y)$
has the special form

$$f(x,y) = U(x) + V(y).$$

In this case, the minimal surface equation 5.42 can be separated

$$U''(1 + V'^2) + V''(1 + U'^2) = 0,$$
$$\frac{U''}{1+U'^2} = -\frac{V''}{1+V'^2} = C.$$

The ordinary differential equations only involve the first and second derivatives
of the variables, so they can be easily integrated. First, we let $R = U'$, and
solve for $U$, setting the integration constants to zero.

$$\frac{R'}{1+R^2} = C,$$
$$\tan^{-1}R = Cx,$$
$$U' = \tan(Cx),$$
$$U = \frac{1}{C}\ln[\sec Cx].$$

The integral for $V$ is done the same way, and the result is $V = -\frac{1}{C}\ln[\sec Cy]$,
so

$$f(x,y) = \frac{1}{C}\left(\ln[\sec Cx] - \ln[\sec Cy]\right).$$

Setting $C = 1$ and rewriting in terms of cosines, we get

$$f(x,y) = \ln\left(\frac{\cos y}{\cos x}\right). \tag{5.44}$$




Fig. 5.7: Scherk’s Surface. 


This is the classical, doubly periodic Scherk surface which we render in figure 
5.7. 
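
To see the minimal surface equation 5.42 at work, the following Python sketch evaluates its left hand side by finite differences, written in the form $(1+f_y^2)f_{xx} - 2f_xf_yf_{xy} + (1+f_x^2)f_{yy}$. It assumes the Scherk function $f = \ln(\cos y/\cos x)$ from the example above; the helper name `minimal_surface_residual`, the step size, and the comparison paraboloid are illustrative assumptions.

```python
import math

def minimal_surface_residual(f, x, y, h=1e-4):
    """Finite-difference value of (1+fy^2) fxx - 2 fx fy fxy + (1+fx^2) fyy."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h ** 2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h ** 2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h ** 2)
    return (1 + fy ** 2) * fxx - 2 * fx * fy * fxy + (1 + fx ** 2) * fyy

def scherk(x, y):
    """Scherk's surface z = ln(cos y / cos x), valid where both cosines are positive."""
    return math.log(math.cos(y) / math.cos(x))

def paraboloid(x, y):
    """A non-minimal comparison surface z = x^2 + y^2."""
    return x ** 2 + y ** 2

print(minimal_surface_residual(scherk, 0.3, 0.5))      # close to zero
print(minimal_surface_residual(paraboloid, 0.3, 0.5))  # clearly non-zero
```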

Next, we prove the area-minimizing property for more general surfaces with
zero mean curvature. The integrand in the formula for surface area is the
square root of the determinant of the first fundamental form, so to perform a
variation, we will need to take derivatives of determinants. The main idea in
obtaining a formula for the derivative of a determinant rests on a neat result
from linear algebra which, at the risk of digressing a bit, is worth proving now.
This theorem, due to Jacobi, will be most important later when we discuss the
exponential map in the context of Lie groups and Lie algebras (See section
7.2.1).


5.2.2 Theorem Let $A$ be a square matrix. Then

$$\det e^{A} = e^{\mathrm{Tr}\,A}. \tag{5.45}$$

Proof First consider the case where $A$ is a diagonal $n \times n$ matrix $A =
\mathrm{diag}[\lambda_1, \lambda_2, \dots, \lambda_n]$. Defining $e^{A}$ by the exponential power series (See
equation 7.52), it is immediately verified that:

$$e^{A} = \mathrm{diag}[e^{\lambda_1}, e^{\lambda_2}, \dots, e^{\lambda_n}],$$
$$\det e^{A} = e^{\lambda_1}e^{\lambda_2}\cdots e^{\lambda_n} = e^{\mathrm{Tr}\,A}.$$

Next, consider the case where $A$ is diagonalizable. Then, there exists a similarity
transformation $Q$ such that $A = Q^{-1}DQ$, where $D$ is the diagonal matrix with
the eigenvalues along the diagonal, $D = \mathrm{diag}[\lambda_1, \lambda_2, \dots, \lambda_n]$. We recall that
determinant and trace are invariant under similarity transformations. We have

$$A^2 = (Q^{-1}DQ)(Q^{-1}DQ) = Q^{-1}D(QQ^{-1})DQ = Q^{-1}D^2Q.$$




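By induction, we can easily prove that $A^n = Q^{-1}D^nQ$ and hence,

$$e^{A} = \sum_{n=0}^{\infty}\frac{1}{n!}A^n = \sum_{n=0}^{\infty}\frac{1}{n!}Q^{-1}D^nQ = Q^{-1}\left(\sum_{n=0}^{\infty}\frac{1}{n!}D^n\right)Q = Q^{-1}e^{D}Q,$$
$$\det e^{A} = \det Q^{-1}\,(\det e^{D})\,\det Q = \det e^{D} = e^{\mathrm{Tr}\,D} = e^{\mathrm{Tr}\,A}.$$

Finally, $A$ might not be diagonalizable, but it can be reduced to canonical Jordan
form $J$ by a similarity transformation $A = Q^{-1}JQ$. The Jordan matrix $J$
has the eigenvalues along the diagonal, with a block structure of the form
$J_k = \lambda I + N$, where $N$ is nilpotent. Since $e^{J}$ is then triangular with the exponentials of
the eigenvalues along the diagonal, it remains true that $\det e^{J} = e^{\mathrm{Tr}\,J}$. Thus
the argument above holds in this case as well.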
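
A small numerical experiment makes the non-diagonalizable case concrete. The sketch below, in plain Python with no external libraries, sums the exponential power series for a $2\times2$ Jordan block and compares $\det e^{A}$ with $e^{\mathrm{Tr}\,A}$. The helper names and the truncation at 60 terms are assumptions made for illustration only.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, terms=60):
    """Truncated power series sum_{k < terms} A^k / k!."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # holds A^k / k!
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

jordan_block = [[2.0, 1.0], [0.0, 2.0]]        # not diagonalizable
print(det2(mat_exp(jordan_block)), math.exp(2.0 + 2.0))
```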


Back to our topic. If in equation 5.45 we replace $A$ by $\ln A$, we get,

$$\det A = e^{\mathrm{Tr}(\ln A)}. \tag{5.46}$$

Suppose that instead of a single matrix $A$, we have a $C^{\infty}$, one-parameter family of
matrices $A_t$. Then, we can use the equation above to differentiate with respect
to $t$.

$$\frac{d}{dt}\det A_t = e^{\mathrm{Tr}(\ln A_t)}\,\frac{d}{dt}\left(\mathrm{Tr}(\ln A_t)\right), \quad\text{hence,}$$
$$\frac{d}{dt}\det A_t = (\det A_t)\,\mathrm{Tr}\left(A_t^{-1}\frac{dA_t}{dt}\right). \tag{5.47}$$
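
Formula 5.47 is easy to test on a concrete family. The following Python sketch assumes an arbitrary smooth family of invertible $2\times2$ matrices (the particular entries are made up for illustration) and compares a central-difference derivative of $\det A_t$ with $(\det A_t)\,\mathrm{Tr}(A_t^{-1}\,dA_t/dt)$.

```python
import math

def family(t):
    """An arbitrary smooth one-parameter family A_t (illustrative choice)."""
    return [[1.0 + t, t * t], [math.sin(t), 2.0 + t]]

def family_derivative(t):
    """Entrywise derivative dA_t/dt of the family above."""
    return [[1.0, 2.0 * t], [math.cos(t), 1.0]]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def jacobi_rhs(t):
    """(det A_t) * Tr(A_t^{-1} dA_t/dt), using the explicit 2 x 2 inverse."""
    A, dA = family(t), family_derivative(t)
    d = det2(A)
    inv = [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]
    trace = sum(inv[i][k] * dA[k][i] for i in range(2) for k in range(2))
    return d * trace

def numerical_derivative(t, h=1e-6):
    """Central-difference estimate of d/dt det A_t."""
    return (det2(family(t + h)) - det2(family(t - h))) / (2.0 * h)

print(numerical_derivative(0.3), jacobi_rhs(0.3))
```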


We unpack this formula for a special kind of variation defined as follows: 


5.2.3 Definition Let $\mathbf{x}(u,v) : U \to \mathbf{R}^3$ be a coordinate patch for a surface
$\{M, g\}$ defined over a set $U \subset \mathbf{R}^2$, and let $\phi : U \to \mathbf{R}$ be a $C^{\infty}$ function.
A normal deformation of the surface is a one-parameter family of surfaces $M_t$
with coordinate patches of the form $\mathbf{x}_t(u,v) = \mathbf{x}(u,v) + t\,\phi(u,v)\,\mathbf{n}$, where $t$ is a
small parameter, $t \in [-\epsilon, \epsilon]$, and $\mathbf{n}$ is the unit normal.

Let $g_t$ be the matrix of the first fundamental form induced on the surface
$M_t$ and let $\det(g_t)$ denote the determinant. The elements of surface area are
$dS_t = \sqrt{\det(g_t)}\;du \wedge dv$, and the areas are given by:

$$A_t = \iint dS_t = \iint \sqrt{\det(g_t)}\;du \wedge dv. \tag{5.48}$$




At $t = 0$, $g_0 = g$ and $dS_0 = dS$ represent the metric and the differential of
surface area element of $M$. We have the following theorem:

5.2.4 Theorem The variation of surface area satisfies

$$A_t'(0) = -2\iint \phi\,H\,dS, \tag{5.49}$$

so that $A_t'(0) = 0$ if, and only if, $H = 0$.
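Proof The proof is by computation, using the formula 5.47 for derivatives of
determinants.

$$A_t' = \frac{d}{dt}\iint \sqrt{\det(g_t)}\;du \wedge dv$$
$$\phantom{A_t'} = \iint \frac{1}{2\sqrt{\det(g_t)}}\,\frac{d}{dt}\left(\det(g_t)\right)\,du \wedge dv$$
$$\phantom{A_t'} = \iint \frac{1}{2\sqrt{\det(g_t)}}\,\det(g_t)\,\mathrm{Tr}\left(g_t^{-1}\frac{dg_t}{dt}\right)du \wedge dv$$
$$\phantom{A_t'} = \frac12\iint \mathrm{Tr}\left(g_t^{-1}\frac{dg_t}{dt}\right)dS_t.$$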


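It remains to compute the derivative of the one-parameter family of first fundamental
forms along the normal deformation.

$$d\mathbf{x}_t = d\mathbf{x} + t\,(\phi\,d\mathbf{n} + d\phi\,\mathbf{n}),$$
$$\langle d\mathbf{x}_t, d\mathbf{x}_t\rangle = \langle d\mathbf{x}, d\mathbf{x}\rangle + 2t\phi\,\langle d\mathbf{x}, d\mathbf{n}\rangle + O(t^2),$$
$$I_t = I - 2\phi t\;II + O(t^2),$$
$$(g_t)_{\alpha\beta} = g_{\alpha\beta} - 2\phi t\,b_{\alpha\beta} + O(t^2).$$

Therefore,

$$\left.\frac{d}{dt}g_t\right|_{t=0} = -2\phi\,b.$$

We deduce that

$$A_t'(0) = -\iint \phi\,\mathrm{Tr}(g^{-1}b)\,dS = -2\iint \phi\,H\,dS,$$

for any function $\phi$, so this concludes the proof. A more complete proof would
include analysis of the second variation, but this will not be treated in these
notes.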


5.2.5 Example Surface of revolution 
Let $M$ be a surface of revolution that is also a minimal surface. The standard
coordinate patch is given by 4.7

$$\mathbf{x}(r,\phi) = (r\cos\phi,\; r\sin\phi,\; f(r))$$




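with fundamental form coefficients

$$E = 1 + f'^2, \qquad e = \frac{f''}{\sqrt{1+f'^2}},$$
$$F = 0, \quad\text{and}\quad f = 0,$$
$$G = r^2, \qquad\quad\; g = \frac{r f'}{\sqrt{1+f'^2}}.$$

For this to be a minimal surface we must have:

$$Eg + Ge = 0,$$
$$r f'(1 + f'^2) + r^2 f'' = 0.$$

The equation is easily integrated. Let $p = f'$, separate variables and integrate
by partial fractions:

$$r p' = -p(1 + p^2),$$
$$\frac{1}{p(1+p^2)}\,dp = -\frac{1}{r}\,dr,$$
$$\frac{p}{\sqrt{1+p^2}} = \frac{A}{r},$$

where $A$ is a constant of integration. Squaring both sides and solving for $p = f'$
we get:

$$p^2 r^2 = A^2(1 + p^2),$$
$$p = f' = \frac{A}{\sqrt{r^2 - A^2}},$$
$$f = A\cosh^{-1}\left(\frac{r}{A}\right).$$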


The conclusion is that a catenoid is the only minimal surface of revolution. The
mean curvature is not a bending invariant because it depends specifically on
the second fundamental form. However, a short computation shows
that the deformation described in equation 4.116, which bends a catenoid into a
helicoid, results in a one-parameter family of surfaces $\mathbf{z}_t$ that are minimal for every
value of $t$. A surface of type $\mathbf{x} = (r\cos\phi,\; r\sin\phi,\; h(\phi))$ is called a conoid. The
helicoid is the only minimal surface that is also a conoid.


5.2.2 Conformal Mappings 


In this section we explore the connection between minimal surfaces and
conformal maps. For this purpose, it will be useful to insert a pedestrian review
of some basic concepts in complex variables. We denote by $\mathbf{C}$ the usual vector
space of complex numbers of the form $z = x + iy$. Complex numbers can also
be represented by $2\times2$ real matrices of the form

$$z = \begin{bmatrix} x & y \\ -y & x \end{bmatrix},$$

in which the binary algebraic operations are mapped to matrix operations.
Thus, for example, multiplying two complex numbers $z_1 z_2$ is equivalent to multiplying
the corresponding matrices. By Euler’s formula, any complex number
of unit length can be written in the form $z = e^{i\theta} = \cos\theta + i\sin\theta$. The matrix
version of a unit vector is a rotation matrix

$$\begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}.$$

The set of all such matrices forms a group called $SO(2)$. There are two special
elements in this set that comprise a basis for the vector space, the identity
matrix $I$ and the symplectic matrix

$$J = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}. \tag{5.50}$$

The former corresponds to a rotation matrix with $\theta = 0$ and the latter to a
rotation by $\theta = \pi/2$. Clearly $J^2 = -I$, showing that $J$ plays the role of the
imaginary number $i$ in the matrix representation.

A fundamental result from complex variables is that if a function w = f(z) = 
u+iv is differentiable in the complex sense (i.e. holomorphic), then the following 
properties hold: 


1. The real and imaginary parts satisfy the Cauchy-Riemann equations $u_x =
v_y$, $u_y = -v_x$.


2. The functions $u$ and $v$ are (conjugate) harmonic: $\nabla^2 u = \nabla^2 v = 0$.


3. The families of curves $u(x,y) =$ constant and $v(x,y) =$ constant are mutually
orthogonal. That is, $(\mathrm{Grad}\,u, \mathrm{Grad}\,v) = 0$. In the context of heat
flow these curves are called isothermal lines.


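4. The map is conformal, that is, it preserves angles. The conformal factor
is given by the Jacobian $|f'(z)|^2 = |\nabla u|^2 = |\nabla v|^2$. In vector component
notation, if $z = (x,y)^T$ and $h = (k_1, k_2)^T$, then by differentiability,

$$f(z + h) = f(z) + Df(h) + \epsilon,$$

where $Df$ is the Jacobian map $f'$ and $\epsilon \to 0$ as $h \to 0$. From the Cauchy
Riemann equations at any given point, we have

$$Df(h) = \begin{bmatrix} u_x & u_y \\ v_x & v_y \end{bmatrix}\begin{bmatrix} k_1 \\ k_2 \end{bmatrix} = \begin{bmatrix} a & -b \\ b & a \end{bmatrix}\begin{bmatrix} k_1 \\ k_2 \end{bmatrix} = \sqrt{a^2+b^2}\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} k_1 \\ k_2 \end{bmatrix},$$

for some numbers $a$, $b$ and some angle $\theta$. Thus, the transformation consists
locally of a dilation and a rotation.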




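5. Let $z = x + iy$ and $\bar{z} = x - iy$. The chain rule gives:

$$\partial_z = \frac{\partial}{\partial z} = \frac12\left(\frac{\partial}{\partial x} - i\frac{\partial}{\partial y}\right), \qquad \partial_{\bar z} = \frac{\partial}{\partial \bar z} = \frac12\left(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\right).$$

6. $w = f(z)$ is holomorphic if and only if $\partial_{\bar z}f = \frac{\partial f}{\partial \bar z} = 0$. This is equivalent
to the Cauchy-Riemann equations.

7. $I = dx^2 + dy^2 = dz\,d\bar{z}$, and $\Delta f = f_{xx} + f_{yy} = 4\,\dfrac{\partial^2 f}{\partial z\,\partial\bar z}$.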
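
Several of the properties in this list can be tested numerically for a given function. The sketch below, in Python, estimates the partial derivatives of $u$ and $v$ by central differences and checks the Cauchy-Riemann equations, the orthogonality of the gradients, and the equality $|\nabla u|^2 = |\nabla v|^2$. The helper names, the step size and the tolerance are assumptions for illustration.

```python
import cmath

def partials(f, z, h=1e-6):
    """Return (u_x, u_y, v_x, v_y) for f = u + iv at z, by central differences."""
    fx = (f(z + h) - f(z - h)) / (2 * h)            # derivative along x
    fy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)  # derivative along y
    return fx.real, fy.real, fx.imag, fy.imag

def has_holomorphic_properties(f, z, tol=1e-6):
    """Check properties 1, 3 and 4 of the list at the point z."""
    ux, uy, vx, vy = partials(f, z)
    cauchy_riemann = abs(ux - vy) < tol and abs(uy + vx) < tol
    orthogonal_gradients = abs(ux * vx + uy * vy) < tol
    equal_norms = abs((ux ** 2 + uy ** 2) - (vx ** 2 + vy ** 2)) < tol
    return cauchy_riemann and orthogonal_gradients and equal_norms

print(has_holomorphic_properties(cmath.exp, 0.3 + 0.4j))
print(has_holomorphic_properties(lambda z: z.conjugate(), 0.3 + 0.4j))
```

The conjugation map is included as a contrast: it preserves the size of angles but fails the Cauchy-Riemann equations, so the check rejects it.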


Since $\mathbf{R}^2$ as a vector space is isomorphic to $\mathbf{C}$, we can extend the complex
structure to a surface $M$ in $\mathbf{R}^3$ by requiring that the locally Euclidean property
be replaced by complex coordinate patches that are holomorphic. This can
actually be done intrinsically for an oriented surface by introducing a (1,1)
tensor $J : TM \to TM$ so that for an orthogonal basis $\{e_1, e_2\}$ of the tangent
space, $J(e_1) = e_2$ and $J(e_2) = -e_1$. This results in a matrix representation
of $J$ at each point identical to the symplectic matrix 5.50 and it represents
a rotation by $\pi/2$. The tensor $J$ can always be introduced in any coordinate
patch by starting with $\mathbf{x}_u$ or $\mathbf{x}_v$ and using the Gram-Schmidt process to find
an orthogonal vector. An easy computation gives:

$$\mathbf{v}_1 = \mathbf{x}_u, \qquad \mathbf{v}_2 = \mathbf{x}_v - \frac{F}{E}\,\mathbf{x}_u = \frac{E\mathbf{x}_v - F\mathbf{x}_u}{E},$$
$$\mathbf{w}_1 = \mathbf{x}_v, \qquad \mathbf{w}_2 = \mathbf{x}_u - \frac{F}{G}\,\mathbf{x}_v = \frac{G\mathbf{x}_u - F\mathbf{x}_v}{G}. \tag{5.51}$$

Orientation is preserved if the differential of surface area satisfies $dS(X, J(X)) > 0$
for all $X \neq 0$. Since $J$ represents a rotation by $\pi/2$, there must be constants
$c_1$ and $c_2$ such that $J\mathbf{x}_u = c_1(E\mathbf{x}_v - F\mathbf{x}_u)$ and $J\mathbf{x}_v = c_2(F\mathbf{x}_v - G\mathbf{x}_u)$. Setting
$\|\mathbf{x}_u\|^2 = \|J\mathbf{x}_u\|^2$ and $\|\mathbf{x}_v\|^2 = \|J\mathbf{x}_v\|^2$ we find that $c_1 = c_2 = 1/\sqrt{\det g}$, hence,
the components of $J$ in the parametric coordinate frame are

$$J\mathbf{x}_u = \frac{E\mathbf{x}_v - F\mathbf{x}_u}{\sqrt{EG - F^2}}, \qquad J\mathbf{x}_v = \frac{F\mathbf{x}_v - G\mathbf{x}_u}{\sqrt{EG - F^2}}. \tag{5.52}$$
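
The components in 5.52 can be assembled into a $2\times2$ matrix in the basis $\{\mathbf{x}_u, \mathbf{x}_v\}$ and tested directly. In the Python sketch below, the columns of the matrix are the components of $J\mathbf{x}_u$ and $J\mathbf{x}_v$; the sample values of $E$, $F$, $G$ are arbitrary, and the function names are hypothetical. The checks are that $J^2 = -I$ and that $J$ preserves the length of $\mathbf{x}_u$ with respect to the metric.

```python
import math

def complex_structure(E, F, G):
    """Matrix of J in the basis (x_u, x_v); columns are J x_u and J x_v."""
    H = math.sqrt(E * G - F * F)
    return [[-F / H, -G / H],
            [E / H, F / H]]

def mat_mul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def metric_norm_sq(E, F, G, a):
    """Squared length of the vector a[0] x_u + a[1] x_v."""
    return E * a[0] ** 2 + 2 * F * a[0] * a[1] + G * a[1] ** 2

E, F, G = 2.0, 0.5, 3.0                      # arbitrary positive-definite metric
J = complex_structure(E, F, G)
print(mat_mul2(J, J))                        # approximately [[-1, 0], [0, -1]]
print(metric_norm_sq(E, F, G, (J[0][0], J[1][0])), E)
```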


5.2.6 Definition Let $\{M, g\}$ and $\{M', g'\}$ be Riemannian manifolds. A map
$F : M \to M'$ is called conformal if there exists a function $\lambda$, such that

$$\langle F_*X, F_*Y\rangle' = \lambda^2\,\langle X, Y\rangle, \tag{5.53}$$

for all tangent vectors. If $\lambda = 1$, the map is an isometry.
Conformal maps on manifolds as defined above have the following properties:

1. $\|F_*X\|^2 = |\lambda|^2\,\|X\|^2$. The reason is that $\|X\|^2 = \langle X, X\rangle$.

2. $F$ preserves angles, since at each point, the $\lambda$’s cancel in the formula for
the cosine of the angle as in 1.51 and 4.17.

3. $F_*(J(X)) = J(F_*(X))$. This is easily shown by applying the maps to
a basis and using the fact that $F$ preserves angles and $J$ represents a
rotation.




5.2.3 Isothermal Coordinates 


An isothermal system as in 4.117 in which the metric takes the form $ds^2 =
\lambda^2(du^2 + dv^2)$ can be considered as a map from $U \subset \mathbf{R}^2$, as a plane surface, to
$M$. This is an example of the most basic conformal map. As previously stated
in theorem 4.5.15, the coordinate patch in this case satisfies the equation,

$$\mathbf{x}_{uu} + \mathbf{x}_{vv} = 2\lambda^2\,\mathbf{H}.$$

The conclusion of this theorem can be restated by saying that a surface in
isothermal coordinates is a minimal surface if and only if its coordinate patch
functions are harmonic in the usual Euclidean sense.


5.2.7 Definition Given an isothermal, conformal patch $\mathbf{x}(u,v)$, we call a
map $\mathbf{y}(u,v)$ a conjugate patch if it satisfies the Cauchy-Riemann equations

$$\mathbf{x}_u = \mathbf{y}_v, \qquad \mathbf{x}_v = -\mathbf{y}_u. \tag{5.54}$$


In complex variables, the real and imaginary parts of a holomorphic function 
are conjugate harmonic functions. Given one of these, one can determine the 
other by integrating the Cauchy-Riemann equations. In a similar way, given 
an isothermal conformal patch, we may determine a conjugate patch and this 
conjugate patch is also isothermal and conformal. The conjugate patches can 
be rendered as the real and imaginary parts of a complex holomorphic patch. 


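5.2.8 Definition Given conjugate harmonic patches $\mathbf{x}$ and $\mathbf{y}$, we define the
associated family to be the one-parameter family

$$\mathbf{z}_t = \mathrm{Re}\left[e^{-it}(\mathbf{x} + i\mathbf{y})\right] = (\cos t)\,\mathbf{x} + (\sin t)\,\mathbf{y}. \tag{5.55}$$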
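
As an illustration of the associated family, the following Python sketch takes the catenoid $\mathbf{x} = (\cosh u\cos v, \cosh u\sin v, u)$ together with the patch $\mathbf{y} = (\sinh u\sin v, -\sinh u\cos v, v)$, which is a helicoid and satisfies $\mathbf{x}_u = \mathbf{y}_v$, $\mathbf{x}_v = -\mathbf{y}_u$. This particular pair is an assumption chosen for the example; the text itself does not fix a pair here. The code estimates $E$, $F$, $G$ for $\mathbf{z}_t$ by central differences, so one can observe that every member of the family stays isothermal with the same conformal factor $\cosh^2 u$.

```python
import math

def catenoid(u, v):
    return (math.cosh(u) * math.cos(v), math.cosh(u) * math.sin(v), u)

def helicoid(u, v):
    """Conjugate patch: x_u = y_v and x_v = -y_u hold for this choice."""
    return (math.sinh(u) * math.sin(v), -math.sinh(u) * math.cos(v), v)

def associated(t, u, v):
    """Member z_t = cos(t) x + sin(t) y of the associated family."""
    x, y = catenoid(u, v), helicoid(u, v)
    return tuple(math.cos(t) * a + math.sin(t) * b for a, b in zip(x, y))

def first_fundamental_form(t, u, v, h=1e-5):
    """Central-difference estimates of E, F, G for the patch z_t."""
    zu = [(p - q) / (2 * h)
          for p, q in zip(associated(t, u + h, v), associated(t, u - h, v))]
    zv = [(p - q) / (2 * h)
          for p, q in zip(associated(t, u, v + h), associated(t, u, v - h))]
    dot = lambda a, b: sum(p * q for p, q in zip(a, b))
    return dot(zu, zu), dot(zu, zv), dot(zv, zv)

for t in (0.0, 0.7, math.pi / 2):
    print(t, first_fundamental_form(t, 0.4, 1.1))
```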


5.2.9 Existence of isothermal coordinates 

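The existence of isothermal coordinates is more subtle and requires some harmonic
analysis. Given $\{M, g\}$ with a coordinate patch $\mathbf{x}(u,v)$, we seek a map
$F(p) = (h(p), k(p))$, in which the metric is isothermal. The condition that the
parametric curves be orthogonal means we must have $(\mathrm{Grad}\,h, \mathrm{Grad}\,k) = 0$.
Using the definition of the gradient 2.26, and recalling the components of the
inverse of the metric 4.85, we get

$$(\mathrm{Grad}\,h, \mathrm{Grad}\,k) = g_{\alpha\beta}(\nabla h)^{\alpha}(\nabla k)^{\beta}$$
$$\phantom{(\mathrm{Grad}\,h, \mathrm{Grad}\,k)} = (\nabla h)^{\alpha}(\nabla k)_{\alpha}$$
$$\phantom{(\mathrm{Grad}\,h, \mathrm{Grad}\,k)} = \frac{1}{\det(g)}\left[k_u(Gh_u - Fh_v) + k_v(Eh_v - Fh_u)\right]$$
$$\phantom{(\mathrm{Grad}\,h, \mathrm{Grad}\,k)} = \frac{1}{\det(g)}\left[k_u(Gh_u - Fh_v) - k_v(Fh_u - Eh_v)\right].$$

The equation $(\mathrm{Grad}\,h, \mathrm{Grad}\,k) = 0$ holds if there exists a function $\lambda$ so that

$$k_u = \lambda(Fh_u - Eh_v), \qquad k_v = \lambda(Gh_u - Fh_v).$$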




To get $|\nabla k|^2 = |\nabla h|^2$ we set $\lambda = 1/\sqrt{\det(g)}$. The integrability condition $k_{uv} =
k_{vu}$ yields the classical Laplace-Beltrami equation 4.86

$$\frac{\partial}{\partial u}\left[\frac{Gh_u - Fh_v}{\sqrt{EG - F^2}}\right] + \frac{\partial}{\partial v}\left[\frac{Eh_v - Fh_u}{\sqrt{EG - F^2}}\right] = 0.$$

Hence, the existence of non-trivial solutions is tied to the harmonic analysis on
the existence of solutions of the elliptic operator.

A more classical approach for the analytic case was obtained by Gauss by
the simple but ingenious method of factoring the first fundamental form $ds^2 =
E\,du^2 + 2F\,du\,dv + G\,dv^2$,

$$ds^2 = \left(\sqrt{E}\,du + \frac{F + i\sqrt{EG - F^2}}{\sqrt{E}}\,dv\right)\left(\sqrt{E}\,du + \frac{F - i\sqrt{EG - F^2}}{\sqrt{E}}\,dv\right).$$

The idea is to find coordinates $h$ and $k$ and a conformal factor $\lambda$, such that

$$ds^2 = \lambda^2\,dh\,dk.$$


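This would suffice, since having found $h$ and $k$, one could set $h = \phi + i\psi$ and
$k = \phi - i\psi$. Following Eisenhart [19], let $t_1$ and $t_2$ be integrating factors of the
system

$$t_1\left(\sqrt{E}\,du + \frac{F + i\sqrt{EG - F^2}}{\sqrt{E}}\,dv\right) = dh = h_u\,du + h_v\,dv,$$
$$t_2\left(\sqrt{E}\,du + \frac{F - i\sqrt{EG - F^2}}{\sqrt{E}}\,dv\right) = dk = k_u\,du + k_v\,dv,$$

where $\lambda^2 = 1/(t_1t_2)$. The first equation above is equivalent to,

$$t_1\sqrt{E} = h_u, \qquad t_1\left(\frac{F + iH}{\sqrt{E}}\right) = h_v,$$

where $H = \sqrt{\det(g)} = \sqrt{EG - F^2}$. Solving for $t_1$ in the first equation and
substituting in the second, we get

$$h_u(F + iH) = Eh_v, \tag{5.56}$$
$$i\,h_u = \frac{Eh_v - Fh_u}{\sqrt{EG - F^2}}. \tag{5.57}$$

On the other hand, multiplying equation 5.56 by $F - iH$ we get

$$h_u(F^2 + H^2) = E[F - iH]h_v,$$
$$h_u(F^2 + EG - F^2) = E(F - iH)h_v,$$
$$EGh_u = E(F - iH)h_v.$$

Hence

$$i\,h_v = \frac{Fh_v - Gh_u}{\sqrt{EG - F^2}}. \tag{5.58}$$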



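The integrability condition $h_{uv} = h_{vu}$ for equations 5.57 and 5.58 gives again
the Laplace Beltrami equation

$$\frac{\partial}{\partial u}\left[\frac{Gh_u - Fh_v}{\sqrt{EG - F^2}}\right] + \frac{\partial}{\partial v}\left[\frac{Eh_v - Fh_u}{\sqrt{EG - F^2}}\right] = 0.$$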

By a completely analogous computation, one can verify that k satisfies the same 
integrability condition. 

The existence of isothermal coordinates is closely related to the existence 
of complex structures. Even for the case of 2-dimensional manifolds, if the 
topological conditions on the metric are weakened, the problem is not simple 
(See: [6]). 


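5.2.10 Exercise Let $ds^2 = E\,du^2 + 2F\,du\,dv + G\,dv^2 = \nu\,|dz + \mu\,d\bar{z}|^2$, where
$\nu > 0$, $|\mu| < 1$, and $z = u + iv$.

1) Substitute $du$ and $dv$ in terms of $dz$ and $d\bar{z}$. Equate coefficients to show
that,

$$\mu = \frac{1}{4\nu}(E - G + 2Fi), \qquad \nu(1 + |\mu|^2) = \frac12(E + G).$$

2) Solve the quadratic equation for $\nu$ and thus show that,

$$\nu = \frac14\left(E + G + 2\sqrt{EG - F^2}\right).$$

3) We want $ds^2 = \lambda^2\,dh\,d\bar{h} = \lambda^2|dh|^2$, where $h = h(z, \bar{z})$. Show that $h$ exists if,
and only if, it satisfies the Beltrami equation

$$\frac{\partial h}{\partial \bar{z}} = \mu\,\frac{\partial h}{\partial z}.$$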


5.2.11 Example Consider the unit sphere with $ds^2 = d\theta^2 + \sin^2\theta\,d\phi^2$. We
rewrite the metric as

$$ds^2 = \sin^2\theta\left(\frac{1}{\sin^2\theta}\,d\theta^2 + d\phi^2\right),$$

and set

$$du = \frac{1}{\sin\theta}\,d\theta, \qquad dv = d\phi.$$

Integrating, we see that the transformation

$$u = \ln|\tan(\tfrac{\theta}{2})|, \qquad v = \phi,$$

gives the metric manifestly in isothermal coordinates

$$ds^2 = \sin^2\theta\,(du^2 + dv^2). \tag{5.59}$$

The example is technically not complete since the conformal factor is not in
terms of the new variables. This is not hard to do, but the geometry will
become more clear in the context of the stereographic projection.




5.2.4 Stereographic Projection 


Consider the unit sphere $S^2 : x^2 + y^2 + z^2 = 1$. For each point $P(x, y, z)$ on
the sphere, we draw a line segment from the north pole to $P$ and extend the
segment until it intersects the $xy$-plane, viewed as a copy of the complex plane,
at point $\zeta = X + iY$. By simple ratio and proportions of corresponding sides
of similar triangles, as partially shown in figure 5.8, we have,

$$\frac{X}{x} = \frac{Y}{y} = \frac{1}{1 - z},$$

so that

$$\zeta = \frac{x + iy}{1 - z}. \tag{5.60}$$

The map $\pi : S^2 \to \mathbf{C}$ projects each point $P$ except for the north pole to
the unique complex number $\zeta$ given by the equation above. The closer the
point $P$ is to the north pole, the larger the norm of $\zeta$. This gives rise to the
geometric interpretation that under the stereographic projection, the north pole
corresponds to the point at infinity in the complex plane. To find the inverse
$\pi^{-1}$ of the projection, first notice that:

$$\zeta\bar{\zeta} + 1 = \frac{x^2 + y^2}{(1-z)^2} + 1 = \frac{x^2 + y^2 + (1-z)^2}{(1-z)^2} = \frac{2 - 2z}{(1-z)^2} = \frac{2}{1 - z}.$$

Fig. 5.8: Stereographic Projection 




Combining that last equation with 5.60, we find the $x$, $y$ and $z$ coordinates on
the sphere,

$$\pi^{-1}(\zeta) = (x, y, z) = \left(\frac{\zeta + \bar\zeta}{\zeta\bar\zeta + 1},\; \frac{\zeta - \bar\zeta}{i(\zeta\bar\zeta + 1)},\; \frac{\zeta\bar\zeta - 1}{\zeta\bar\zeta + 1}\right). \tag{5.61}$$

The existence of the inverse map shows that the sphere is locally diffeomorphic
to the complex plane. If $\eta$ represents the complex coordinate arising from the
stereographic projection from the south pole, then the transition functions on
the overlap of the coordinate patches are given by $\zeta = 1/\eta$ and $\eta = 1/\zeta$. Thus, $S^2$
may be considered as a complex manifold called the Riemann sphere. In polar
coordinates $X = R\cos\phi$, $Y = R\sin\phi$, so, by ratio and proportion applied to
similar triangles as in figure 5.8, we see that $\|\zeta\| = R = \cot(\tfrac{\theta}{2})$, and hence

$$\zeta = \cot(\tfrac{\theta}{2})\,e^{i\phi}, \qquad \eta = \tan(\tfrac{\theta}{2})\,e^{-i\phi}. \tag{5.62}$$
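
The formulas 5.60, 5.61 and 5.62 lend themselves to a round-trip test. The Python sketch below implements the projection from the north pole and its inverse for a point given in spherical coordinates; the function names `project` and `inverse` are illustrative, and the sample angles are arbitrary.

```python
import cmath
import math

def project(x, y, z):
    """Stereographic projection from the north pole: zeta = (x + iy)/(1 - z)."""
    return complex(x, y) / (1.0 - z)

def inverse(zeta):
    """Inverse projection; note zeta + conj(zeta) = 2 Re(zeta), etc."""
    n = abs(zeta) ** 2
    return (2.0 * zeta.real / (n + 1.0),
            2.0 * zeta.imag / (n + 1.0),
            (n - 1.0) / (n + 1.0))

theta, phi = 1.1, 2.0                      # polar and azimuthal angles
point = (math.sin(theta) * math.cos(phi),
         math.sin(theta) * math.sin(phi),
         math.cos(theta))
zeta = project(*point)
print(zeta, cmath.rect(1.0 / math.tan(theta / 2.0), phi))  # cot(theta/2) e^{i phi}
print(inverse(zeta), point)
```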


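The short computation that follows gives the complex metric of the Riemann
sphere in terms of the standard metric in spherical coordinates.

$$d\zeta = -\tfrac12\csc^2(\tfrac{\theta}{2})\,e^{i\phi}\,d\theta + i\cot(\tfrac{\theta}{2})\,e^{i\phi}\,d\phi,$$
$$d\bar\zeta = -\tfrac12\csc^2(\tfrac{\theta}{2})\,e^{-i\phi}\,d\theta - i\cot(\tfrac{\theta}{2})\,e^{-i\phi}\,d\phi,$$
$$d\zeta\,d\bar\zeta = \tfrac14\csc^4(\tfrac{\theta}{2})\,d\theta^2 + \cot^2(\tfrac{\theta}{2})\,d\phi^2 = \frac{1}{4\sin^4(\tfrac{\theta}{2})}\left(d\theta^2 + \sin^2\theta\,d\phi^2\right).$$

Using equation 5.62 again, we find:

$$1 + \zeta\bar\zeta = 1 + \cot^2(\tfrac{\theta}{2}) = \csc^2(\tfrac{\theta}{2}),$$

therefore,

$$\frac{4\,d\zeta\,d\bar\zeta}{(1 + \zeta\bar\zeta)^2} = d\theta^2 + \sin^2\theta\,d\phi^2. \tag{5.63}$$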
This shows that the map is conformal. Looking back at equation 5.59 for the
metric of the sphere in isothermal coordinates given by the transformation

$$u = \ln|\tan(\tfrac{\theta}{2})|, \qquad v = \phi,$$

set $\alpha = u - iv$, and

$$\eta = e^{\alpha} = \tan(\tfrac{\theta}{2})\,e^{-i\phi}.$$

Thus, we see that the transformation is effectively the stereographic projection
from the south pole. The complex version of the first fundamental form is called
the Fubini-Study metric of the Riemann sphere. If we let $K$ be the function

$$K = \ln(1 + \zeta\bar\zeta), \tag{5.64}$$




then we can write the components $g$ of the complex metric as

$$g = 4\,\partial_{\zeta}\partial_{\bar\zeta}K.$$

This formula generalizes to $\mathbf{CP}^n$, for which

$$K = \ln(1 + \delta_{ij}\,z^i\bar{z}^j),$$
$$ds^2 = g_{i\bar{j}}\,dz^i\,d\bar{z}^j,$$
$$g_{i\bar{j}} = \partial_{z^i}\partial_{\bar{z}^j}K.$$

The function $K$ is called the Kähler potential. Let $\partial$ and $\bar\partial$ be the Dolbeault
operators that give the natural extension of the exterior derivative in several
complex variables; that is, given a differential form $\alpha = f_{IJ}\,dz^I \wedge d\bar{z}^J$ for
multi-indices $I$ and $J$, we define

$$\partial\alpha = \frac{\partial f_{IJ}}{\partial z^k}\,dz^k \wedge dz^I \wedge d\bar{z}^J,$$
$$\bar\partial\alpha = \frac{\partial f_{IJ}}{\partial \bar{z}^k}\,d\bar{z}^k \wedge dz^I \wedge d\bar{z}^J.$$

Then we obtain a natural, non-degenerate 2-form

$$\omega = i\,\partial\bar\partial K = i\,\frac{d\zeta \wedge d\bar\zeta}{(1 + \zeta\bar\zeta)^2}, \tag{5.65}$$

called the Kähler form of the sphere $S^2$. Some authors include a conventional
factor of 1/2, but the factor of $i$ is required for compatibility of the Hermitian and
the Riemannian structure. Thus, in terms of differential geometry structures in
dimension 2, this is as good as it gets. The two-sphere has a natural Riemannian
structure inherited by the induced metric from $\mathbf{R}^3$, a complex structure induced
by the stereographic projection, and a natural symplectic 2-form given by the
Kähler form above. This is summarized by saying that we have the structure
of a Kähler manifold. As we will see later, the Kähler form 5.65 is at the center
of the construction of Dirac monopoles and instanton bundles.

The formulas for the stereographic projection 5.60 naturally extrapolate to
spheres in all dimensions. If the unit sphere $S^n \subset \mathbf{R}^{n+1}$ is given by equation

$$(x^1)^2 + (x^2)^2 + \dots + (x^{n+1})^2 = 1,$$

then the coordinates $X^i$ of the projection onto $\mathbf{R}^n$ from the north pole $(0, 0, \dots, 1)$
are given by

$$X^i = \frac{x^i}{1 - x^{n+1}}, \qquad i = 1, \dots, n.$$


In the case of the unit circle $S^1$, the quantity $\zeta$ is a real number. Since we are
using $\theta$ as the azimuthal angle, we have the unconventional parametrization of
the unit circle $x = \sin\theta$, $y = \cos\theta$. Then the formulas for $\pi^{-1}$ and the metric
read:

$$x = \frac{2\zeta}{\zeta^2 + 1}, \qquad y = \frac{\zeta^2 - 1}{\zeta^2 + 1}, \qquad d\theta = -\frac{2}{\zeta^2 + 1}\,d\zeta. \tag{5.66}$$

Hence, the substitution $\zeta = \cot(\tfrac{\theta}{2})$ transforms integrals of rational functions
of sines and cosines into integrals of rational functions of $\zeta$. This is the source of the now
infamous Weierstrass substitution, which is more commonly written in terms of
the polar angle, $t = \tan(\tfrac{\theta}{2})$. It is by this method that one obtains the integral
5.67, and the neat formula

$$\int \csc\theta\,d\theta = \ln|\tan(\tfrac{\theta}{2})|,$$



that often appear in the theory of curves and surfaces. The elegant Weierstrass
substitution is, for the wrong reasons, no longer taught as a technique of integration
in the standard second semester calculus course. This author does not
disagree with those calculus reformers who pushed the topic out of the curriculum
at the time, but strongly disagrees with the reason given, namely
that they never again encountered this in their work. There should have been a
geometer on the committee. The stereographic projection is the starting point
for the entire theory of spinors.

As an added bonus, if in equation 5.66 we pick $\zeta = n/m$ to be any rational
number on the real line, the inverse stereographic map yields a rational point
on the unit circle with entries giving Euclid's formula for Pythagorean triplets
\[ \left( \frac{2mn}{n^2+m^2},\ \frac{n^2-m^2}{n^2+m^2} \right). \]
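The rational-point construction can be tried out directly. The following Python sketch is not part of the original text; the function names are illustrative, and it assumes the convention $\zeta = n/m$ together with the inverse map $x = 2\zeta/(\zeta^2+1)$, $y = (\zeta^2-1)/(\zeta^2+1)$. Exact rational arithmetic is used so that the identity $x^2 + y^2 = 1$ holds without rounding.

```python
from fractions import Fraction


def inverse_stereographic(zeta):
    """Send a real parameter zeta to the point (x, y) on the unit circle."""
    denom = zeta * zeta + 1
    return 2 * zeta / denom, (zeta * zeta - 1) / denom


def pythagorean_triple(m, n):
    """Integer triple (2mn, n^2 - m^2, n^2 + m^2) read off from zeta = n/m."""
    x, y = inverse_stereographic(Fraction(n, m))
    c = n * n + m * m
    # x * c and y * c are Fractions with denominator 1, so int() is exact.
    return int(x * c), int(y * c), c


if __name__ == "__main__":
    for m, n in [(1, 2), (2, 3), (1, 4)]:
        a, b, c = pythagorean_triple(m, n)
        print((a, b, c), a * a + b * b == c * c)
```

Any pair of positive integers with $n > m$ gives a triple with positive entries; pairs with a common factor, or with both entries odd, give non-primitive triples.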


Fig. 5.9: Mercator Projection 


5.2.12 Example Mercator Projection 
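Given a surface of revolution 4.7 with metric
\[ ds^2 = (1 + f'^2)\, dr^2 + r^2\, d\phi^2 = r^2\left[ \frac{1+f'^2}{r^2}\, dr^2 + d\phi^2 \right], \]
we set
\[ d\eta = \frac{\sqrt{1+f'^2}}{r}\, dr, \quad d\xi = d\phi, \qquad \text{or,} \qquad \eta = \int \frac{\sqrt{1+f'^2}}{r}\, dr, \quad \xi = \phi. \]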


This is a conformal map into a plane with Cartesian coordinates $(\xi, \eta)$ in which
the meridians map to the lines $\xi = $ constant, and the horizontal circles on the surface of
revolution map to the parallels $\eta = $ constant. In particular, for a sphere in which
$f(r) = \sqrt{a^2 - r^2}$, the substitution $r = a\cos\theta$, with $\theta$ the latitude, yields, up to the sign fixed by the orientation of $\eta$,
\[ \eta = \int \sec\theta\, d\theta = \ln\left|\tan\left(\tfrac{\theta}{2} + \tfrac{\pi}{4}\right)\right|. \qquad (5.67) \]
Comparing with equation 4.23, we see that in this projection, loxodromes on
the sphere map to straight lines on the plane. The projection is more faithful
near the equator, as shown in figure 5.9. The map is depicted with a four-color
scheme, which is always possible by the four color theorem.
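The closed form of the Mercator integral can be checked numerically. The sketch below is an illustration added here, not the author's computation; it assumes $\theta$ is the latitude measured from the equator and uses a hand-written composite Simpson rule (the helper names are hypothetical).

```python
import math


def simpson(func, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3


def mercator_eta_numeric(latitude):
    """Integrate sec(t) from the equator up to the given latitude (radians)."""
    return simpson(lambda t: 1.0 / math.cos(t), 0.0, latitude)


def mercator_eta_closed(latitude):
    """Closed form ln|tan(latitude/2 + pi/4)|."""
    return math.log(abs(math.tan(latitude / 2 + math.pi / 4)))


if __name__ == "__main__":
    for lat in (0.3, 0.8, 1.2):
        print(lat, mercator_eta_numeric(lat), mercator_eta_closed(lat))
```

The rapid growth of $\eta$ as the latitude approaches $\pi/2$ is the familiar stretching of the polar regions on a Mercator map.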


5.2.5 Minimal Surfaces by Conformal Maps 


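Consider a surface $\{M, g\}$ with coordinate patch $\mathbf{x}(u,v)$. Let $\zeta = u + iv$
and $\bar\zeta = u - iv$, so that $d\zeta = du + i\,dv$, $d\bar\zeta = du - i\,dv$. The composite map gives
the patch as
\[ \mathbf{x}(\zeta,\bar\zeta) = (x^1(\zeta,\bar\zeta),\ x^2(\zeta,\bar\zeta),\ x^3(\zeta,\bar\zeta)). \]
The complex derivative with respect to $\zeta$ of the patch is given by:
\[ \mathbf{x}_\zeta = \frac{\partial \mathbf{x}}{\partial \zeta} = \tfrac{1}{2}(\mathbf{x}_u - i\,\mathbf{x}_v). \]
Let $\phi = 2\mathbf{x}_\zeta$, and $\phi^2 = 4\langle \mathbf{x}_\zeta, \mathbf{x}_\zeta\rangle$. Since we have chosen $\phi$ to depend only on $\zeta$,
we have $\partial\phi/\partial\bar\zeta = 0$, so $\phi$ is holomorphic. We have the following theorem: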


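5.2.13 Theorem A holomorphic patch $\mathbf{x}$ is isothermal if and only if
\[ \phi^2 = (\phi^1)^2 + (\phi^2)^2 + (\phi^3)^2 = 0. \qquad (5.68) \]
Proof The proof is by direct computation.
\[
\begin{aligned}
\phi^2 &= 4\langle \partial\mathbf{x}/\partial\zeta,\ \partial\mathbf{x}/\partial\zeta\rangle, \\
&= \langle \mathbf{x}_u - i\mathbf{x}_v,\ \mathbf{x}_u - i\mathbf{x}_v\rangle, \\
&= \langle \mathbf{x}_u, \mathbf{x}_u\rangle - \langle \mathbf{x}_v, \mathbf{x}_v\rangle - 2i\langle \mathbf{x}_u, \mathbf{x}_v\rangle, \\
&= E - G - 2iF.
\end{aligned}
\]
The theorem follows because $\phi^2 = 0$ holds exactly when its real and imaginary parts vanish, that is, when $E = G$ and $F = 0$, which is the condition for an isothermal patch.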


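5.2.14 Remark The function $\phi$ is a complex function, so we must be careful
not to confuse $\phi^2$ as defined above with $|\phi|^2 = \phi\bar\phi$. The latter is given by
\[
\begin{aligned}
|\phi|^2 &= 4\langle \mathbf{x}_\zeta, \mathbf{x}_{\bar\zeta}\rangle, \\
&= \langle \mathbf{x}_u - i\mathbf{x}_v,\ \mathbf{x}_u + i\mathbf{x}_v\rangle, \\
&= \langle \mathbf{x}_u, \mathbf{x}_u\rangle + \langle \mathbf{x}_v, \mathbf{x}_v\rangle, \\
&= E + G, \qquad (5.69)
\end{aligned}
\]
so if the patch is isothermal, $E = G = \tfrac{1}{2}|\phi|^2$.

We now have a process to construct minimal surfaces. We need to find a
holomorphic patch $\phi$ with $\phi^2 = 0$. To construct such patches we introduce a
special class of complex curves.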


5.2.15 Definition A complex curve $\phi: U \subset \mathbb{C} \to \mathbb{C}^3$ is called a minimal
curve or an isotropic curve if the differential of arc length satisfies
\[ ds^2 = (dx^1)^2 + (dx^2)^2 + (dx^3)^2 = 0. \]
To find such curves we once again take advantage of the wonderful ingenuity
of the leading mathematicians of the 1800s. Factoring over the complex numbers, we
have


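\[
\begin{aligned}
(dx^1 + i\,dx^2)(dx^1 - i\,dx^2) + (dx^3)^2 &= 0, \\
(dx^1 + i\,dx^2)(dx^1 - i\,dx^2) &= -(dx^3)^2, \\
\frac{dx^1 + i\,dx^2}{dx^3} &= -\frac{dx^3}{dx^1 - i\,dx^2}.
\end{aligned}
\]
Setting the left hand side equal to some function $-\tau$, we get a pair of differential
equations
\[ \frac{dx^1}{dx^3} + i\,\frac{dx^2}{dx^3} = -\tau, \qquad \frac{dx^1}{dx^3} - i\,\frac{dx^2}{dx^3} = \frac{1}{\tau}. \]
If we add the two equations we get:
\[ 2\,\frac{dx^1}{dx^3} = \frac{1}{\tau} - \tau = \frac{1-\tau^2}{\tau}, \qquad \text{so} \qquad \frac{dx^1}{1-\tau^2} = \frac{dx^3}{2\tau}. \]
Subtracting the equations instead, we get:
\[ 2i\,\frac{dx^2}{dx^3} = -\tau - \frac{1}{\tau} = -\frac{1+\tau^2}{\tau}, \qquad \text{so} \qquad \frac{dx^2}{i(1+\tau^2)} = \frac{dx^3}{2\tau}. \]
We have thus shown that
\[ \frac{dx^1}{1-\tau^2} = \frac{dx^2}{i(1+\tau^2)} = \frac{dx^3}{2\tau} = F(\tau)\,d\tau, \qquad (5.70) \]
for some arbitrary analytic function $F$. At the end, we want real geometric
objects, so we integrate and extract the real part.
\[
\begin{aligned}
x^1 &= \Re\int (1-\tau^2)\,F(\tau)\,d\tau, \\
x^2 &= \Re\int i(1+\tau^2)\,F(\tau)\,d\tau, \\
x^3 &= \Re\int 2\tau\,F(\tau)\,d\tau. \qquad (5.71)
\end{aligned}
\]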


This choice of coordinates satisfies the condition $\phi^2 = 0$ for an isothermal patch.
Although a bit redundant, this can be verified immediately.
\[
\begin{aligned}
\phi &= F\,(1-\tau^2,\ i(1+\tau^2),\ 2\tau), \\
\phi^2 &= F^2\left[(1-\tau^2)^2 - (1+\tau^2)^2 + 4\tau^2\right] = 0.
\end{aligned}
\]
Equations 5.71 are the classical Weierstrass coordinates, from which one ascertains
that a holomorphic function $F(\tau)$ gives rise to a minimal surface.
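The vanishing of $\phi^2$ is easy to confirm numerically for sample choices of $F$. The following sketch is an illustration only (the function names are not from the text); it evaluates the bilinear square, which must vanish, and the Hermitian square $|\phi|^2$, which should equal $2|F|^2(1+|\tau|^2)^2$.

```python
def weierstrass_phi(F, tau):
    """Integrand phi = F(tau) * (1 - tau^2, i(1 + tau^2), 2 tau) of equations 5.71."""
    value = F(tau)
    return (value * (1 - tau**2), value * 1j * (1 + tau**2), value * 2 * tau)


def complex_square(phi):
    """Bilinear square (phi^1)^2 + (phi^2)^2 + (phi^3)^2, with no conjugation."""
    return sum(component * component for component in phi)


def hermitian_square(phi):
    """|phi|^2 = phi . conj(phi), which equals E + G for the patch."""
    return sum(abs(component) ** 2 for component in phi)


if __name__ == "__main__":
    samples = {
        "constant": lambda t: 3,
        "inverse square": lambda t: -1 / t**2,
        "one minus inverse fourth": lambda t: 1 - 1 / t**4,
    }
    tau = 0.7 + 1.3j
    for name, F in samples.items():
        phi = weierstrass_phi(F, tau)
        expected = 2 * abs(F(tau)) ** 2 * (1 + abs(tau) ** 2) ** 2
        print(name, abs(complex_square(phi)), hermitian_square(phi), expected)
```

The three sample functions are the ones used in this section for the Enneper, catenoid and Henneberg surfaces.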


Fig. 5.10: Enneper Surface 


5.2.16 Example Enneper surface
This is the surface corresponding to a Weierstrass patch with the simplest nontrivial
holomorphic function, namely a constant $F(\tau) = a$, which we choose to be
$a = 3$. The surface was discovered by Enneper in 1864. Integration of 5.71 with
$F(\tau) = 3$ and $\tau = u + iv$ gives,
\[
\begin{aligned}
x^1 &= \Re(3\tau - \tau^3) = 3u + 3uv^2 - u^3, \\
x^2 &= \Re\,[i(3\tau + \tau^3)] = -3v - 3u^2v + v^3, \\
x^3 &= \Re(3\tau^2) = 3(u^2 - v^2),
\end{aligned}
\]
resulting in the coordinate patch
\[ \mathbf{x}(u,v) = (3u + 3uv^2 - u^3,\ -3v - 3u^2v + v^3,\ 3(u^2 - v^2)). \qquad (5.72) \]
If one sets $\tau = re^{i\phi}$, a polar coordinate parametrization is obtained
\[ \mathbf{x}(r,\phi) = (3r\cos\phi - r^3\cos 3\phi,\ -3r\sin\phi - r^3\sin 3\phi,\ 3r^2\cos 2\phi). \qquad (5.73) \]
Figure 5.10 shows some Enneper surfaces, the first with $u$ and $v$ ranging from 0
to 1.2 and the second for $r \in [0,2]$, $\phi \in [-\pi,\pi]$. The surface is self-intersecting if
the range is big enough. In the polar parametrization $K = -(4/9)(1+r^2)^{-4}$
and of course $H = 0$. Using an advanced algebraic geometry technique called
Gröbner basis, one can eliminate the parameters and show that the surface
is algebraic, and that it can be written implicitly in terms of a ninth degree
polynomial. As will be discussed later in equation 5.79, there is an alternate
way to write the Weierstrass parametrization. In this alternate formulation, the
classical Enneper surface is generated by $f = 1$ and $g = \sigma$. A generalization of
the Enneper surface is obtained by taking $g(\sigma) = \sigma^n$, where $n+1$ is the number
of "flaps" bending up. The surface on the right of figure 5.10 corresponds to
the case $n = 3$.
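As a quick sanity check of the Enneper patch, the sketch below (an added illustration with hypothetical function names, not the author's code) compares the polynomial form of the patch with the real parts of $3\tau - \tau^3$, $i(3\tau + \tau^3)$ and $3\tau^2$, and evaluates the coefficients $E$, $F$, $G$ of the first fundamental form from the hand-computed partial derivatives. For an isothermal patch one expects $E = G$ and $F = 0$.

```python
def enneper_from_tau(u, v):
    """Patch obtained as real parts of the integrated Weierstrass data with F = 3."""
    tau = complex(u, v)
    return (
        (3 * tau - tau**3).real,
        (1j * (3 * tau + tau**3)).real,
        (3 * tau**2).real,
    )


def enneper_patch(u, v):
    """Polynomial form of the same patch."""
    return (
        3 * u + 3 * u * v**2 - u**3,
        -3 * v - 3 * u**2 * v + v**3,
        3 * (u**2 - v**2),
    )


def first_fundamental_form(u, v):
    """Coefficients E, F, G computed from the partial derivatives x_u and x_v."""
    xu = (3 + 3 * v**2 - 3 * u**2, -6 * u * v, 6 * u)
    xv = (6 * u * v, -3 - 3 * u**2 + 3 * v**2, -6 * v)
    E = sum(a * a for a in xu)
    F = sum(a * b for a, b in zip(xu, xv))
    G = sum(b * b for b in xv)
    return E, F, G


if __name__ == "__main__":
    print(enneper_from_tau(2, -1), enneper_patch(2, -1))
    print(first_fundamental_form(2, -1))
```

With integer inputs the polynomial expressions are exact, and the common value of $E$ and $G$ agrees with $9(1 + u^2 + v^2)^2$.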

Setting r =constant in the polar form of the Enneper surface, we get curves 
that are not spherical curves, but they are close. On the left of Figure 5.11 we 
display the intersection of an Enneper surface with a sphere. 


Fig. 5.11: Baseball Seam 


This raises the fun problem of finding a curve that is actually spherical and
resembles the seam of a tennis ball or a baseball. We explore this possibility
with a curve of the form
\[ \mathbf{x}(\phi) = (a\cos\phi - b\cos 3\phi,\ a\sin\phi + b\sin 3\phi,\ c\cos 2\phi), \]
and requiring it to have a constant norm. A short computation using the sum
and half-angle formulas for cosine gives
\[
\begin{aligned}
\|\mathbf{x}\|^2 &= a^2 + b^2 - 2ab\cos 4\phi + c^2\cos^2 2\phi, \\
&= a^2 + b^2 - 2ab\cos 4\phi + \tfrac{c^2}{2}(1 + \cos 4\phi), \\
&= a^2 + b^2 + \tfrac{c^2}{2} + \left(\tfrac{c^2}{2} - 2ab\right)\cos 4\phi.
\end{aligned}
\]
To make this length constant, independently of $\phi$, we set $c^2 = 4ab$, which,
remarkably, leads to the norm being a perfect square, $\|\mathbf{x}\|^2 = (a + b)^2$. A nice
choice leading to all integer coefficients is $a = 9$, $b = 4$, which gives $c = 12$. The
graph shown on the right in figure 5.11 shows that this results in a reasonable
shape for the seam of a baseball. Personally, I find this more gratifying than
the baseball I hand-made as a kid because I could not afford one. I lost that
ball on the first pitch, when an older boy hit a home run into a pig sty.
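The constant-norm claim for the seam curve is simple to test. The following sketch is an added illustration (the function names are hypothetical); it samples the curve with $a = 9$, $b = 4$, $c = 12$ and checks that every point lies on the sphere of radius $a + b = 13$.

```python
import math


def seam_point(phi, a=9.0, b=4.0, c=12.0):
    """Point of the curve (a cos t - b cos 3t, a sin t + b sin 3t, c cos 2t)."""
    return (
        a * math.cos(phi) - b * math.cos(3 * phi),
        a * math.sin(phi) + b * math.sin(3 * phi),
        c * math.cos(2 * phi),
    )


def norm(point):
    """Euclidean length of a point in space."""
    return math.sqrt(sum(coordinate * coordinate for coordinate in point))


if __name__ == "__main__":
    samples = [2 * math.pi * k / 12 for k in range(12)]
    print([round(norm(seam_point(phi)), 9) for phi in samples])
```

Changing $c$ away from $\sqrt{4ab}$ in the call makes the printed norms oscillate with $\cos 4\phi$, as the computation above predicts.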


5.2.17 Example Catenoid 

As we would expect, the ubiquitous catenoid can be obtained from the Weierstrass
parametrization. Anticipating a coordinate patch with cosh functions,
we choose $F(\tau) = -a/\tau^2$ and then let $\tau = e^{w} = e^{u+iv}$. Integrating, and then
applying the summation formulae for hyperbolic functions, we get:
\[
\begin{aligned}
x^1 &= -a\,\Re\int \left[(1/\tau^2) - 1\right] d\tau, \\
&= -a\,\Re\,[(-1/\tau) - \tau], \\
&= a\,\Re\,[e^{w} + e^{-w}] = 2a\,\Re\,[\cosh w], \\
&= 2a\,\Re\,[\cosh(u + iv)], \\
&= 2a\cosh u\cos v.
\end{aligned}
\]
In a similar manner,
\[
\begin{aligned}
x^2 &= -a\,\Re\int i\left[(1/\tau^2) + 1\right] d\tau, \\
&= -a\,\Re\,\{i[(-1/\tau) + \tau]\}, \\
&= -a\,\Re\,\{i[e^{w} - e^{-w}]\} = -2a\,\Re\,[i\sinh w], \\
&= -2a\,\Re\,[i\sinh(u + iv)], \\
&= 2a\cosh u\sin v.
\end{aligned}
\]
The last integration is a one-liner
\[ x^3 = -2a\,\Re\,[\ln\tau] = -2a\,\Re(w) = -2au. \]
The result is the catenoid
\[ \mathbf{x}(u,v) = (2a\cosh u\cos v,\ 2a\cosh u\sin v,\ -2au). \]
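The Weierstrass integrals for $F(\tau) = -a/\tau^2$ can also be evaluated by numerical quadrature, which gives an independent check on the closed forms. The sketch below is an added illustration under stated assumptions: the path of integration is taken to be $\tau(t) = e^{wt}$, $t \in [0,1]$, starting at $\tau = 1$, and the helper names are hypothetical. With this choice the quadrature reproduces $(2a\cosh u\cos v,\ 2a\cosh u\sin v,\ -2au)$ up to the constant of integration.

```python
import cmath
import math


def simpson(func, n=2000):
    """Composite Simpson rule for a complex-valued function on [0, 1]; n is even."""
    h = 1.0 / n
    total = func(0.0) + func(1.0)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(k * h)
    return total * h / 3


def weierstrass_patch(F, w):
    """Real parts of the three integrals 5.71 along tau(t) = exp(w t)."""
    components = (
        lambda s: 1 - s * s,
        lambda s: 1j * (1 + s * s),
        lambda s: 2 * s,
    )

    def integrate(component):
        def integrand(t):
            tau = cmath.exp(w * t)
            return component(tau) * F(tau) * w * tau  # d tau = w tau dt
        return simpson(integrand).real

    return tuple(integrate(component) for component in components)


def catenoid_closed_form(a, u, v):
    """Closed form shifted so that it vanishes at tau = 1, that is, u = v = 0."""
    return (
        2 * a * math.cosh(u) * math.cos(v) - 2 * a,
        2 * a * math.cosh(u) * math.sin(v),
        -2 * a * u,
    )


if __name__ == "__main__":
    a, u, v = 1.5, 0.5, 0.8
    print(weierstrass_patch(lambda s: -a / s**2, complex(u, v)))
    print(catenoid_closed_form(a, u, v))
```

The shift by $-2a$ in the first component only translates the surface and does not affect its shape.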


5.2.18 Example Henneberg Surface 




Fig. 5.12: Henneberg Surface 


This surface was discovered in 1875 by the German mathematician Lebrecht
Henneberg. The surface can be obtained from Weierstrass coordinates 5.71 by
choosing $F(\tau) = 1 - 1/\tau^4$ and then letting $\tau = e^{w} = e^{u+iv}$. In a manner similar
to the computation above, we integrate the equations and follow by applying
the summation formulae for hyperbolic functions.
\[
\begin{aligned}
x^1 &= \Re\int \left[(1-\tau^2)(1 - 1/\tau^4)\right] d\tau, \\
&= \Re\int \left[1 + 1/\tau^2 - \tau^2 - 1/\tau^4\right] d\tau, \\
&= \Re\,[(\tau - 1/\tau) - (1/3)(\tau^3 - 1/\tau^3)], \\
&= 2\,\Re\,[\sinh w - (1/3)\sinh 3w], \\
&= 2\,\Re\,[\sinh(u+iv) - (1/3)\sinh 3(u+iv)], \\
&= 2\sinh u\cos v - (2/3)\sinh 3u\cos 3v.
\end{aligned}
\]
The integral for $x^2$ gives,
\[
\begin{aligned}
x^2 &= \Re\int i\left[(1+\tau^2)(1 - 1/\tau^4)\right] d\tau, \\
&= \Re\int i\left[1 - 1/\tau^2 + \tau^2 - 1/\tau^4\right] d\tau, \\
&= \Re\,\{i[(\tau + 1/\tau) + (1/3)(\tau^3 + 1/\tau^3)]\}, \\
&= 2\,\Re\,\{i[\cosh w + (1/3)\cosh 3w]\}, \\
&= 2\,\Re\,\{i[\cosh(u+iv) + (1/3)\cosh 3(u+iv)]\}, \\
&= -2\sinh u\sin v - (2/3)\sinh 3u\sin 3v.
\end{aligned}
\]
The last integration is a bit simpler
\[
\begin{aligned}
x^3 &= \Re\int \left[2\tau(1 - 1/\tau^4)\right] d\tau, \\
&= \Re\int \left[2\tau - 2/\tau^3\right] d\tau, \\
&= \Re\,[\tau^2 + 1/\tau^2], \\
&= 2\,\Re\,[\cosh 2w], \\
&= 2\,\Re\,[\cosh 2(u+iv)], \\
&= 2\cosh 2u\cos 2v.
\end{aligned}
\]
Neglecting the factor of 2 and using a common abbreviation for the hyperbolic
functions, the result is:
\[ \mathbf{x}(u,v) = \left(\operatorname{sh} u\cos v - \tfrac{1}{3}\operatorname{sh} 3u\cos 3v,\ -\operatorname{sh} u\sin v - \tfrac{1}{3}\operatorname{sh} 3u\sin 3v,\ \operatorname{ch} 2u\cos 2v\right). \]
The curve given by $v = 0$ is a geodesic in the shape of a semicubical parabola,
which is the intersection of the surface with the plane $y = 0$. This is a reflection
of the fact that the Henneberg surface is a Björling surface, that is, a minimal
surface that contains a given curve with prescribed normal. It turns out that
the Henneberg surface is a model representing an immersion in $\mathbb{R}^3$ of the
projective plane.

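The curvature for a minimal surface in Weierstrass coordinates can be calculated
from equation 4.118. Using the fact that $E = G = \tfrac{1}{2}|\phi|^2$, we have,
\[
\begin{aligned}
\phi &= F\,(1-\tau^2,\ i(1+\tau^2),\ 2\tau), \\
|\phi|^2 &= |F|^2\left[(1-\tau^2)(1-\bar\tau^2) + (1+\tau^2)(1+\bar\tau^2) + 4\tau\bar\tau\right], \\
&= |F|^2\,(2 + 2\tau^2\bar\tau^2 + 4\tau\bar\tau), \\
&= 2|F|^2(1 + |\tau|^2)^2.
\end{aligned}
\]
The conformal factor is $E = \lambda^2 = \tfrac{1}{2}|\phi|^2 = |F|^2(1+|\tau|^2)^2$, so the Gaussian curvature is given by
\[
\begin{aligned}
K &= -\frac{1}{\lambda^2}\nabla^2(\ln\lambda), \\
&= -\frac{4}{\lambda^2}\frac{\partial^2}{\partial\tau\,\partial\bar\tau}\left[\ln\left(|F|(1+|\tau|^2)\right)\right], \\
&= -\frac{4}{\lambda^2}\frac{\partial^2}{\partial\tau\,\partial\bar\tau}\left[\tfrac{1}{2}(\ln F + \ln\bar F) + \ln(1+|\tau|^2)\right], \\
&= -\frac{4}{\lambda^2}\frac{\partial^2}{\partial\tau\,\partial\bar\tau}\left[\ln(1+|\tau|^2)\right], \\
&= -\frac{4}{\lambda^2}\frac{\partial}{\partial\tau}\left(\frac{\tau}{1+|\tau|^2}\right), \\
&= -\frac{4}{\lambda^2(1+|\tau|^2)^2}.
\end{aligned}
\]
So, the result is,
\[ K = -\frac{4}{\lambda^2(1+|\tau|^2)^2} = -\frac{4}{|F|^2(1+|\tau|^2)^4}. \qquad (5.74) \]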


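There is an equivalent formulation of the Weierstrass parametrization that also
appears frequently in the literature. From equation 5.68, the holomorphic patch
$\mathbf{x}$ is isothermal if
\[ (\phi^1)^2 + (\phi^2)^2 + (\phi^3)^2 = 0. \qquad (5.75) \]
Proceeding along the same lines as in the computation of isotropic coordinates,
we have
\[ (\phi^1 + i\phi^2)(\phi^1 - i\phi^2) = -(\phi^3)^2, \]
\[ (\phi^1 + i\phi^2) = -\frac{(\phi^3)^2}{(\phi^1 - i\phi^2)}. \qquad (5.76) \]
Set
\[ f = (\phi^1 - i\phi^2), \qquad g = \frac{\phi^3}{(\phi^1 - i\phi^2)}. \qquad (5.77) \]
Here, $f$ is holomorphic and $g$ is meromorphic, but $fg^2 = -(\phi^1 + i\phi^2)$ is holomorphic.
Since we also have $fg = \phi^3$, we can easily solve for the components of $\phi$ in terms
of $f$ and $g$. The result is,
\[
\begin{aligned}
\phi^1 &= \tfrac{1}{2} f(1 - g^2), \\
\phi^2 &= \tfrac{i}{2} f(1 + g^2), \\
\phi^3 &= fg. \qquad (5.78)
\end{aligned}
\]
Since $\phi = \mathbf{x}_u - i\mathbf{x}_v$, we are led to an alternative Weierstrass parametrization
\[
\begin{aligned}
x^1 &= \Re\int \tfrac{1}{2} f(\sigma)\,(1 - g(\sigma)^2)\, d\sigma, \\
x^2 &= \Re\int \tfrac{i}{2} f(\sigma)\,(1 + g(\sigma)^2)\, d\sigma, \\
x^3 &= \Re\int f(\sigma)\,g(\sigma)\, d\sigma. \qquad (5.79)
\end{aligned}
\]
For easy reference, we denote this parametrization by the notation $\mathbf{x} = W(f, g)$.
To see the relationship to the equations 5.71, consider the case when $g$ is
holomorphic and $g^{-1}$ is holomorphic. Then, we can use $g$ as the complex variable,
and we set $g = \tau$, so that $dg = d\tau$. Let $F = \tfrac{1}{2} f/g' = \tfrac{1}{2} f\, d\sigma/dg$. Then
$F(\tau)\,d\tau = \tfrac{1}{2} f(\sigma)\,d\sigma$ and we have recovered equation 5.71. Choosing minus the
imaginary parts in equation 5.79 yields conjugate minimal surfaces, given by
the conjugate harmonic patch $\mathbf{y}$ as defined by equation 5.54.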

A remarkable result can be obtained as follows. As a complex variable, $g$
can be mapped into the unit sphere $S^2$ by the inverse stereographic projection
(see equation 5.61)
\[ \pi^{-1}(g) = \frac{\left(g + \bar g,\ -i(g - \bar g),\ g\bar g - 1\right)}{g\bar g + 1}. \qquad (5.80) \]
On the other hand, the real and imaginary parts of $\phi = \mathbf{x}_u - i\mathbf{x}_v$ comprise two
orthogonal tangent vectors to the surface. We compute the dot product
\[
\begin{aligned}
\langle \phi, \pi^{-1}g\rangle &= \frac{f}{2(|g|^2+1)}\left[(1-g^2)(g+\bar g) + (1+g^2)(g-\bar g) + 2g(|g|^2 - 1)\right], \\
&= \frac{f}{2(|g|^2+1)}\left[(g+\bar g) + (g-\bar g) - g^2(g + \bar g - g + \bar g) + 2g|g|^2 - 2g\right], \\
&= \frac{f}{2(|g|^2+1)}\left[2g - 2g^2\bar g + 2g|g|^2 - 2g\right], \\
&= 0.
\end{aligned}
\]
We conclude that $\pi^{-1}g$ is a unit normal to the surface, and thus we have the
following theorem,

5.2.19 Theorem If $\mathbf{x}(\zeta,\bar\zeta)$ is an isothermal holomorphic patch with $\phi =
2\mathbf{x}_\zeta = \mathbf{x}_u - i\mathbf{x}_v$, then the function $g = \phi^3/(\phi^1 - i\phi^2)$ in the Weierstrass
parametrization $W(f,g)$ of a minimal surface is the stereographic projection $\pi$
of the Gauss map. That is,
\[ \pi \circ N = g, \qquad (5.81) \]
where, as usual, $N = (\mathbf{x}_u \times \mathbf{x}_v)/\|\mathbf{x}_u \times \mathbf{x}_v\|$.
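The orthogonality computation behind the theorem can be spot-checked with complex arithmetic. The sketch below is an added illustration (hypothetical function names); it builds $\phi$ from sample values of $f$ and $g$, maps $g$ to the sphere by the inverse stereographic projection, and evaluates the bilinear dot product. A vanishing complex value means that both the real and the imaginary part of $\phi$ are orthogonal to the image point.

```python
def inverse_stereographic(g):
    """Point of the unit sphere assigned to the complex number g."""
    d = abs(g) ** 2 + 1
    # g + conj(g) = 2 Re(g) and -i (g - conj(g)) = 2 Im(g).
    return (2 * g.real / d, 2 * g.imag / d, (abs(g) ** 2 - 1) / d)


def phi_from(f, g):
    """Components of phi written in terms of f and g."""
    return (0.5 * f * (1 - g * g), 0.5j * f * (1 + g * g), f * g)


def bilinear_dot(phi, n):
    """Dot product of a complex vector with a real vector, without conjugation."""
    return sum(p * c for p, c in zip(phi, n))


if __name__ == "__main__":
    f, g = 0.3 - 1.1j, 1.7 + 0.4j
    n = inverse_stereographic(g)
    print(sum(c * c for c in n), abs(bilinear_dot(phi_from(f, g), n)))
```

The first printed number is the squared length of the image point and the second is the size of the dot product.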


Rewriting the expression for the Gaussian curvature 5.74 in terms of $f$ and $g$,
we get,
\[ K = -\left[\frac{4|g'|}{|f|\,(1 + |g|^2)^2}\right]^2. \qquad (5.82) \]
We see immediately that the one-parameter family of associated patches $\mathbf{x}_t$
given by
\[ \mathbf{x}_t = \Re\left(e^{it}\int \phi\, d\sigma\right), \qquad (5.83) \]
results in a family of isometric minimal surfaces. This provides another way to
view the deformation of a catenoid into a helicoid described earlier in equation
4.116.


5.2.20 Example Bour surface 

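The surface $W(f, g)$, with $f = 1$ and $g = \sqrt{\sigma}$, was first discussed in 1861 by Bour,
who subsequently was awarded the mathematics prize of the French Academy
of Sciences. The integrals are completely elementary,
\[
\begin{aligned}
x^1 &= \tfrac{1}{2}\,\Re\int (1 - \sigma)\, d\sigma = \tfrac{1}{2}\,\Re\left[z - \tfrac{1}{2}z^2\right], \\
x^2 &= \tfrac{1}{2}\,\Re\int i(1 + \sigma)\, d\sigma = \tfrac{1}{2}\,\Re\left[i\left(z + \tfrac{1}{2}z^2\right)\right], \\
x^3 &= \Re\int \sqrt{\sigma}\, d\sigma = \tfrac{2}{3}\,\Re\left[z^{3/2}\right].
\end{aligned}
\]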


Fig. 5.13: Bour and Trinoid Surfaces 


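Converting to polar form, $z = re^{i\phi}$, we get the parametrization
\[ \mathbf{x}(r,\phi) = \left(\tfrac{1}{2} r\cos\phi - \tfrac{1}{4} r^2\cos 2\phi,\ -\tfrac{1}{2} r\sin\phi - \tfrac{1}{4} r^2\sin 2\phi,\ \tfrac{2}{3} r^{3/2}\cos\left(\tfrac{3}{2}\phi\right)\right). \qquad (5.84) \]
The surface is rendered in figure 5.13a, with $r \in [0, 4]$.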


5.2.21 Example Trinoid surface
This surface is part of a family of surfaces indexed by
\[ f = \frac{1}{(\sigma^k - 1)^2}, \qquad g = \sigma^{k-1}, \]
where $k = 2, 3, \dots$. These remarkable surfaces have neat topologies discovered
by Jorge and Meeks in 1983. The binoid case $k = 2$ is just a catenoid. The
trinoid corresponds to $k = 3$. The integrals for the trinoid yield the following
parametrization
\[
\begin{aligned}
x^1 &= \Re\left[\frac{z}{6(z^2+z+1)} - \frac{2}{9}\ln(z-1) + \frac{1}{9}\ln(z^2+z+1)\right], \\
x^2 &= \Re\left\{ i\left[-\frac{z(z+1)}{6(z^3-1)} + \frac{2\sqrt{3}}{9}\tan^{-1}\left(\tfrac{1}{\sqrt{3}}(2z+1)\right)\right]\right\}, \\
x^3 &= \Re\left[-\frac{1}{3(z^3-1)}\right],
\end{aligned}
\]
where $z = u + iv$. The algebra involved in finding the real coordinate patch
is messy, so the task is best left to a computer algebra system. Converting to
polar coordinates by letting $z = re^{i\phi}$ improves the rendering of the surface plot. As
shown in the portion of the surface in figure 5.13b, the surface has three openings,
meaning that the Gauss map misses three points on $S^2$. In the same manner,
the $k$-noid is topologically equivalent to a sphere with $k$ points removed.


For convenient reference, we include the following table showing the choices
of $f$ and $g$ yielding the listed minimal surfaces.

f                                       g               Surface     Author, Date
$-e^{it}e^{\sigma}/2$, $t = 0$          $e^{-\sigma}$   Catenoid    Euler, 1740
$-e^{it}e^{\sigma}/2$, $t = \pi/2$      $e^{-\sigma}$   Helicoid    Meusnier, 1770
$2/(1 - \sigma^4)$                      $\sigma$        Scherk      Scherk, 1834
$1$                                     $\sqrt{\sigma}$ Bour        Bour, 1861
$1$                                     $\sigma$        Enneper     Enneper, 1864
$2(1 - \sigma^{-4})$                    $\sigma$        Henneberg   Henneberg, 1875
$(\sigma^3 - 1)^{-2}$                   $\sigma^2$      Trinoid     Jorge and Meeks, 1983
$\wp$                                   $A/\wp'$        Costa       Costa, 1996


If I had to pick a minimal surface to represent the “orchid”, which is the
national flower of my native country, I would pick the Costa surface. This
surface was not discovered until 1982, triggering a renewed interest in minimal
surfaces with non-trivial topologies. A parametrization was not produced until
1996. The integrals involve Weierstrass elliptic functions, which probably would
have been most pleasing to Weierstrass, but require more advanced knowledge
than is typically covered in introductory courses on complex variables. The
Weierstrass elliptic function $\wp(z)$ is implemented in Maple as,
\[ \text{WeierstrassP}(z, g_2, g_3) = \frac{1}{z^2} + \sum_{\omega \neq 0}\left[\frac{1}{(z-\omega)^2} - \frac{1}{\omega^2}\right]. \qquad (5.85) \]




Fig. 5.14: Weierstrass Elliptic Function 


The functions $\wp$ are meromorphic with a pole of order 2 at the origin. They
are doubly periodic over an arbitrary lattice $\{\omega = 2m\omega_1 + 2n\omega_2 \mid m, n \in \mathbb{Z}\}$
with periods $\omega_1$ and $\omega_2$. The quantities $g_2$ and $g_3$ are called the invariants and
are related to $\omega_1$ and $\omega_2$. In the Maple implementation, the periods are set to
$\omega_1 = \omega_2 = \tfrac{1}{2}$, which does not result in any real loss of generality. A salient
property of the invariants is that they link $\wp$ to the cubic differential equation
\[ (\wp')^2 = 4\wp^3 - g_2\wp - g_3. \qquad (5.86) \]
Costa's surface is generated by setting $f = \wp$ and $g = A/\wp'$.

Following Alfred Gray, as shown in Eric Weisstein's Wolfram MathWorld
[40], we pick $g_2 = 189.07272$, $g_3 = 0$ and $A = \sqrt{2\pi g_2}$. As we have done before,
we first try simply inserting the choices of $f$ and $g$ into $W(f,g)$, integrating
with Maple with the substitution $z = u + iv$, and then extracting the real part.
The result is a bit disconcerting, as one obtains expressions with the Weierstrass $\zeta$




Fig. 5.15: Costa’s Minimal Surface 


function with very large coefficients of the order of 10°. The problem becomes 
immediately clear by observing a plot of the real part of $\wp$, as shown in figure
5.14. The $W(f,g)$ coordinates are integrated by default from 0 to $z$, so the
problem is caused by the pole of order 2 at the origin. Still, one may persist
by proceeding to plot the surface, restricting the values of $u$ and $v$ to stay
away from the singularities. We chose the ranges to go from 0.02 to 0.98.
Surprisingly, Maple takes some time to compute, but it renders the beautiful
flowery-shaped surface that appears at the left in figure 5.15. While aesthetically
pleasing, the figure is topologically inaccurate. A much more rigorous analysis,
such as the one done by A. Gray in deriving the coordinates quoted at the MathWorld
site, leads to the widely circulated picture of Costa's surface that appears on the
right. Costa's surface is topologically equivalent to a torus (genus = 1) with
three points removed.


Chapter 6 


Riemannian Geometry 


6.1 Riemannian Manifolds 


In the definition of manifolds introduced in section 4.1, it was implicitly assumed
that manifolds were embedded (or immersed) in $\mathbb{R}^n$. As such, they inherited
a natural metric induced by the standard Euclidean metric of $\mathbb{R}^n$, as shown in
section 4.2. For general manifolds it is more natural to start with a topological
space $M$, and define the coordinate patches as pairs $\{U_i, \phi_i\}$, where $\{U_i\}$ is an
open cover of $M$ with local homeomorphisms
\[ \phi_i : U_i \subset M \to \mathbb{R}^n. \]
If $p \in U_i \cap U_j$ is a point in the non-empty intersection of two charts, we require
that the overlap map $\phi_{ij} = \phi_i\phi_j^{-1} : \mathbb{R}^n \to \mathbb{R}^n$ be a diffeomorphism. The local
coordinates on the patch $\{U, \phi\}$ are given by $(x^1, \dots, x^n)$, where
\[ x^i = u^i \circ \phi, \]
and $u^i : \mathbb{R}^n \to \mathbb{R}$ are the projection maps on each slot. The concept is the same
as in figure 4.2, but, as stated, we are not assuming a priori that $M$ is embedded
(or immersed) in Euclidean space. If in addition the space is equipped with a
metric, the space is called a Riemannian manifold. If the signature of the metric
is of type $g = \text{diag}(1, 1, \dots, -1, -1)$, with $p$ '+' entries and $q$ '-' entries, we say
that $M$ is a pseudo-Riemannian manifold of type $(p,q)$. As we have done with
Minkowski's space, we switch to Greek indices $x^\mu$ for local coordinates of curved
space-times. We write the Riemannian metric as
\[ ds^2 = g_{\mu\nu}\, dx^\mu dx^\nu. \qquad (6.1) \]
We will continue to be consistent with earlier notation and denote the tangent
space at a point $p \in M$ as $T_pM$, the tangent bundle as $TM$, and the space of
vector fields as $\mathscr{X}(M)$. Similarly, we denote the space of differential $k$-forms
by $\Omega^k(M)$, and the set of type $\binom{r}{s}$ tensor fields by $\mathscr{T}^r_s(M)$.




Fig. 6.1: Coordinate Charts 


Product Manifolds 


Suppose that $M_1$ and $M_2$ are differentiable manifolds of dimensions $m_1$ and
$m_2$ respectively. Then, $M_1 \times M_2$ can be given a natural manifold structure of
dimension $n = m_1 + m_2$ induced by the product of coordinate charts. That is,
if $(\phi_{i_1}, U_{i_1})$ is a chart in $M_1$ in a neighborhood of $p_1 \in M_1$, and $(\phi_{i_2}, U_{i_2})$ is a
chart in $M_2$ in a neighborhood of $p_2 \in M_2$, then the map
\[ \phi_{i_1} \times \phi_{i_2} : U_{i_1} \times U_{i_2} \to \mathbb{R}^n \]
defined by
\[ (\phi_{i_1} \times \phi_{i_2})(p_1, p_2) = (\phi_{i_1}(p_1), \phi_{i_2}(p_2)) \]
is a coordinate chart in the product manifold. An atlas constructed from such
charts gives the differentiable structure. Clearly, $M_1 \times M_2$ is locally diffeomorphic
to $\mathbb{R}^{m_1} \times \mathbb{R}^{m_2}$. To discuss the tangent space of a product manifold, we
recall from linear algebra that, given two vector spaces $V$ and $W$, the direct
sum $V \oplus W$ is the vector space consisting of the set of ordered pairs
\[ V \oplus W = \{(v, w) : v \in V,\ w \in W\}, \]
together with the vector operations
\[
\begin{aligned}
(v_1, w_1) + (v_2, w_2) &= (v_1 + v_2,\ w_1 + w_2), \quad \text{for all } v_1, v_2 \in V;\ w_1, w_2 \in W, \\
k(v, w) &= (kv, kw), \quad \text{for all } k \in \mathbb{R}.
\end{aligned}
\]
People often say that one cannot add apples and peaches, but this is not a
problem for mathematicians. For example, 3 apples and 2 peaches plus 4 apples
and 6 peaches is 7 apples and 8 peaches. This is the basic idea behind the direct
sum. We now have the following theorem:


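6.1.1 Theorem Let $(p_1, p_2) \in M_1 \times M_2$. Then there is a vector space isomorphism
\[ T_{(p_1,p_2)}(M_1 \times M_2) \cong T_{p_1}M_1 \oplus T_{p_2}M_2. \]
Proof The proof is adapted from [18]. Given $X_1 \in T_{p_1}M_1$ and $X_2 \in T_{p_2}M_2$,
let
\[
\begin{aligned}
&x_1(t) \text{ be a curve with } x_1(0) = p_1 \text{ and } x_1'(0) = X_1, \\
&x_2(t) \text{ be a curve with } x_2(0) = p_2 \text{ and } x_2'(0) = X_2.
\end{aligned}
\]
Then, we can associate
\[ (X_1, X_2) \in T_{p_1}M_1 \oplus T_{p_2}M_2 \]
with the vector $X \in T_{(p_1,p_2)}(M_1 \times M_2)$, which is tangent to the curve $x(t) =
(x_1(t), x_2(t))$ at the point $(p_1, p_2)$. In the simplest possible case where the
product manifold is $\mathbb{R}^2 = \mathbb{R}^1 \times \mathbb{R}^1$, the vector $X$ would be the velocity vector
$X = x'(t)$ of the curve $x(t)$. It is convenient to introduce the inclusion maps
\[ i_{p_2} : M_1 \to M_1 \times M_2, \qquad i_{p_1} : M_2 \to M_1 \times M_2, \]
defined by,
\[
\begin{aligned}
i_{p_2}(p) &= (p, p_2), \quad \text{for } p \in M_1, \\
i_{p_1}(q) &= (p_1, q), \quad \text{for } q \in M_2.
\end{aligned}
\]
The images of the vectors $X_1$ and $X_2$ under the push-forwards of the inclusion
maps
\[
\begin{aligned}
i_{p_2 *} &: T_{p_1}M_1 \to T_{(p_1,p_2)}(M_1 \times M_2), \\
i_{p_1 *} &: T_{p_2}M_2 \to T_{(p_1,p_2)}(M_1 \times M_2),
\end{aligned}
\]
yield vectors $\bar X_1$ and $\bar X_2$, given by,
\[ \bar X_1 = i_{p_2 *}(X_1), \qquad \bar X_2 = i_{p_1 *}(X_2). \]
Then, it is easy to show that,
\[ X = i_{p_2 *}(X_1) + i_{p_1 *}(X_2). \]
Indeed, if $f$ is a smooth function $f : M_1 \times M_2 \to \mathbb{R}$, we have,
\[
\begin{aligned}
X(f) &= \frac{d}{dt}\, f(x_1(t), x_2(t))\Big|_{t=0}, \\
&= \frac{d}{dt}\, f(x_1(t), x_2(0))\Big|_{t=0} + \frac{d}{dt}\, f(x_1(0), x_2(t))\Big|_{t=0}, \\
&= \bar X_1(f) + \bar X_2(f).
\end{aligned}
\]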


More generally, if
\[ \varphi : M_1 \times M_2 \to N \]
is a smooth manifold mapping, then we have a type of product rule formula for
the Jacobian map,
\[
\begin{aligned}
\varphi_* X &= \varphi_*(i_{p_2 *}(X_1)) + \varphi_*(i_{p_1 *}(X_2)), \\
&= (\varphi \circ i_{p_2})_* X_1 + (\varphi \circ i_{p_1})_* X_2. \qquad (6.2)
\end{aligned}
\]
This formula will be useful in the treatment of principal fiber bundles, in which
case we have a bundle space $E$, and a Lie group $G$ acting on the right by a
product manifold map $\mu : E \times G \to E$.


6.2 Submanifolds 


A Riemannian submanifold is a submanifold of a Riemannian manifold that is
itself Riemannian with respect to the induced metric. The most natural example is a hypersurface in $\mathbb{R}^n$. If
$(x^1, x^2, \dots, x^n)$ are local coordinates in $\mathbb{R}^n$ with the standard metric, and the
surface $M$ is defined locally by functions $x^i = x^i(u^\alpha)$, then $M$ together with
the induced first fundamental form 4.12, has a canonical Riemannian structure.
We will continue to use the notation $\overline{\nabla}$ for a connection in the ambient space
and $\nabla$ for the connection on the surface induced by the tangential component
of the covariant derivative
\[ \overline{\nabla}_X Y = \nabla_X Y + H(X,Y), \qquad (6.3) \]
where $H(X,Y)$ is the component in the normal space. In the case of a hypersurface,
we have the classical Gauss equation 4.74
\[
\begin{aligned}
\overline{\nabla}_X Y &= \nabla_X Y + II(X,Y)\,N \qquad &(6.4) \\
&= \nabla_X Y + \langle LX, Y\rangle N, \qquad &(6.5)
\end{aligned}
\]
where $LX = -\overline{\nabla}_X N$ is the Weingarten map. If $M$ is a submanifold of codimension
$k$, then there are $k$ normal vectors $N_a$ and $k$ classical second
fundamental forms $II_a(X,Y)$, so that $H(X,Y) = \sum_a II_a(X,Y)\,N_a$.

As shown by the theorema egregium, the curvature of a surface in $\mathbb{R}^3$ depends
only on the first fundamental form, so the definition of Gaussian curvature as
the determinant of the second fundamental form does not even make sense intrinsically.
One could redefine $K$ by Cartan's second structure equation, as it
was used to compute curvatures in Chapter 4, but what we need is a more general
definition of curvature that is applicable to any Riemannian manifold. The
concept leading to the equations of the theorema egregium involved calculation
of the difference of second derivatives of tangent vectors. At the risk of being
somewhat misleading, figure 6.2 illustrates the concept. In this figure, the vector
field $X$ consists of unit vectors tangent to parallels on the sphere, and the
vector field $Y$ consists of unit tangents to meridians. If an arbitrary tangent vector $Z$
is parallel-transported from one point on a spherical triangle to the diagonally
opposed point, the result depends on the path taken. Parallel transport of $Z$


Fig. 6.2: R(X,Y)Z 


along $X$ followed by $Y$ would yield a different outcome than parallel transport
along $Y$ followed by parallel transport along $X$. The failure of the covariant
derivatives to commute is a reflection of the existence of curvature. Clearly, the
analogous parallel transport by two different paths on a rectangle in $\mathbb{R}^n$ yields
the same result. This fact is the reason why in elementary calculus, vectors are
defined as quantities that depend only on direction and length. As indicated,
the picture is misleading because covariant derivatives, as is the case with
any other type of derivative, involve comparing the change of a vector under
infinitesimal parallel transport. The failure of a vector to return to itself when
parallel-transported along a closed path is measured by an entity related to the
curvature called the holonomy of the connection. Still, the figure should help
motivate the definition that follows.


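6.2.1 Definition On a Riemannian manifold with connection $\nabla$, the curvature
$R$ and the torsion $T$ are defined by:
\[
\begin{aligned}
R(X,Y) &= \nabla_X\nabla_Y - \nabla_Y\nabla_X - \nabla_{[X,Y]}, \qquad &(6.6) \\
T(X,Y) &= \nabla_X Y - \nabla_Y X - [X,Y]. \qquad &(6.7)
\end{aligned}
\]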


6.2.2 Theorem The curvature $R$ is a tensor. At each point $p \in M$, $R(X,Y)$
assigns to each pair of tangent vectors a linear transformation from $T_pM$ into
itself.

Proof Let $X, Y, Z \in \mathscr{X}(M)$ be vector fields on $M$. We need to establish that
$R$ is multilinear. Since clearly $R(X,Y) = -R(Y,X)$, we only need to establish
linearity on two slots. Let $f$ be a $C^\infty$ function. Then, using $[fX, Y] = f[X,Y] - Y(f)X$,
\[
\begin{aligned}
R(fX, Y)Z &= \nabla_{fX}\nabla_Y Z - \nabla_Y\nabla_{fX} Z - \nabla_{[fX,Y]}Z, \\
&= f\nabla_X\nabla_Y Z - \nabla_Y(f\nabla_X Z) - \nabla_{f[X,Y] - Y(f)X}Z, \\
&= f\nabla_X\nabla_Y Z - Y(f)\nabla_X Z - f\nabla_Y\nabla_X Z - f\nabla_{[X,Y]}Z + Y(f)\nabla_X Z, \\
&= f\nabla_X\nabla_Y Z - f\nabla_Y\nabla_X Z - f\nabla_{[X,Y]}Z, \\
&= fR(X,Y)Z.
\end{aligned}
\]
Similarly, recalling that $[X,Y] \in \mathscr{X}(M)$, we get:
\[
\begin{aligned}
R(X,Y)(fZ) &= \nabla_X\nabla_Y(fZ) - \nabla_Y\nabla_X(fZ) - \nabla_{[X,Y]}(fZ), \\
&= \nabla_X(Y(f)Z + f\nabla_Y Z) - \nabla_Y(X(f)Z + f\nabla_X Z) - [X,Y](f)Z - f\nabla_{[X,Y]}Z, \\
&= XY(f)Z + Y(f)\nabla_X Z + X(f)\nabla_Y Z + f\nabla_X\nabla_Y Z \\
&\quad - YX(f)Z - X(f)\nabla_Y Z - Y(f)\nabla_X Z - f\nabla_Y\nabla_X Z \\
&\quad - [X,Y](f)Z - f\nabla_{[X,Y]}Z, \\
&= fR(X,Y)Z.
\end{aligned}
\]
We leave it as an almost trivial exercise to check linearity over addition in all
slots.


6.2.3 Theorem The torsion $T$ is also a tensor.
Proof Since $T(X,Y) = -T(Y,X)$, it suffices to prove linearity on one slot.
Thus,
\[
\begin{aligned}
T(fX, Y) &= \nabla_{fX}Y - \nabla_Y(fX) - [fX, Y], \\
&= f\nabla_X Y - Y(f)X - f\nabla_Y X - fXY + Y(fX), \\
&= f\nabla_X Y - Y(f)X - f\nabla_Y X - fXY + Y(f)X + fYX, \\
&= f\nabla_X Y - f\nabla_Y X - f[X,Y], \\
&= fT(X,Y).
\end{aligned}
\]
Again, linearity over sums is clear.


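6.2.4 Theorem In a Riemannian manifold there exists a unique torsion-free
connection, called the Levi-Civita connection, that is compatible with the metric.
That is:
\[
\begin{aligned}
[X,Y] &= \nabla_X Y - \nabla_Y X, \qquad &(6.8) \\
\nabla_X\langle Y, Z\rangle &= \langle\nabla_X Y, Z\rangle + \langle Y, \nabla_X Z\rangle. \qquad &(6.9)
\end{aligned}
\]
Proof The proof parallels the computation leading to equation 4.76. Let $\nabla$ be
a torsion-free connection compatible with the metric. Taking the three cyclic derivatives
of the inner product, and subtracting the third from the sum of the first two, we get
\[
\begin{aligned}
(a)\quad \nabla_X\langle Y, Z\rangle &= \langle\nabla_X Y, Z\rangle + \langle Y, \nabla_X Z\rangle, \\
(b)\quad \nabla_Y\langle X, Z\rangle &= \langle\nabla_Y X, Z\rangle + \langle X, \nabla_Y Z\rangle, \\
(c)\quad \nabla_Z\langle X, Y\rangle &= \langle\nabla_Z X, Y\rangle + \langle X, \nabla_Z Y\rangle, \\
(a) + (b) - (c) &= \langle\nabla_X Y, Z\rangle + \langle\nabla_Y X, Z\rangle + \langle[X,Z], Y\rangle + \langle[Y,Z], X\rangle, \\
&= 2\langle\nabla_X Y, Z\rangle + \langle[Y,X], Z\rangle + \langle[X,Z], Y\rangle + \langle[Y,Z], X\rangle.
\end{aligned}
\]
Therefore:
\[
\begin{aligned}
\langle\nabla_X Y, Z\rangle = \tfrac{1}{2}\{&\nabla_X\langle Y, Z\rangle + \nabla_Y\langle X, Z\rangle - \nabla_Z\langle X, Y\rangle \\
&+ \langle[X,Y], Z\rangle + \langle[Z,X], Y\rangle + \langle[Z,Y], X\rangle\}. \qquad (6.10)
\end{aligned}
\]
The bracket of any two vector fields is a vector field, so the connection is unique
since it is completely determined by the metric. In disguise, this is the formula
in local coordinates for the Christoffel symbols 4.76. This follows immediately
by choosing $X = \partial/\partial x^\alpha$, $Y = \partial/\partial x^\beta$ and $Z = \partial/\partial x^\gamma$. Conversely, if one
defines $\nabla_X Y$ by equation 6.10, a long but straightforward computation with
lots of cancellations shows that this defines a connection compatible with the
metric.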
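In a coordinate frame the brackets vanish, and formula 6.10 reduces to the usual expression $\Gamma^\gamma{}_{\alpha\beta} = \tfrac{1}{2} g^{\gamma\delta}(\partial_\alpha g_{\delta\beta} + \partial_\beta g_{\alpha\delta} - \partial_\delta g_{\alpha\beta})$. The sketch below is an added illustration rather than part of the original text. It assumes the unit sphere with coordinates $(\theta, \phi)$ and metric $\text{diag}(1, \sin^2\theta)$, approximates the derivatives of the metric by central differences, and uses hypothetical function names.

```python
import math


def metric(x):
    """Metric of the unit sphere in coordinates x = (theta, phi)."""
    theta, _ = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def d_metric(x, k, h=1e-5):
    """Central-difference approximation of the derivative of g_ij along x^k."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]


def christoffel(x):
    """gamma[c][a][b] approximates Gamma^c_{ab} of the Levi-Civita connection."""
    g = metric(x)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    ginv = [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]
    dg = [d_metric(x, k) for k in range(2)]  # dg[k][i][j] = d g_ij / d x^k
    gamma = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    for c in range(2):
        for a in range(2):
            for b in range(2):
                gamma[c][a][b] = 0.5 * sum(
                    ginv[c][d] * (dg[a][d][b] + dg[b][a][d] - dg[d][a][b])
                    for d in range(2)
                )
    return gamma


if __name__ == "__main__":
    theta = 1.0
    gamma = christoffel((theta, 0.3))
    print(gamma[0][1][1], -math.sin(theta) * math.cos(theta))
    print(gamma[1][0][1], math.cos(theta) / math.sin(theta))
```

The two printed pairs compare the numerical values with the familiar symbols $\Gamma^\theta{}_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi{}_{\theta\phi} = \cot\theta$ of the sphere.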


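As before, if $\{e_\alpha\}$ is a frame with dual frame $\{\theta^\alpha\}$, we define the connection
forms $\omega$, Christoffel symbols $\Gamma$ and torsion components in the frame by
\[
\begin{aligned}
\nabla_X e_\beta &= \omega^\gamma{}_\beta(X)\, e_\gamma, \qquad &(6.11) \\
\nabla_{e_\alpha} e_\beta &= \Gamma^\gamma{}_{\alpha\beta}\, e_\gamma, \qquad &(6.12) \\
T(e_\alpha, e_\beta) &= T^\gamma{}_{\alpha\beta}\, e_\gamma. \qquad &(6.13)
\end{aligned}
\]
As was pointed out in the previous chapter, if the frame is a holonomic
frame, such as the coordinate frame $\{\partial/\partial x^\mu\}$, for which the bracket is zero, then
$T = 0$ implies that the Christoffel symbols are symmetric in the lower indices.
\[ T^\gamma{}_{\alpha\beta} = \Gamma^\gamma{}_{\alpha\beta} - \Gamma^\gamma{}_{\beta\alpha} = 0. \]
For such a coordinate frame, we can compute the components of the Riemann
tensor as follows:
\[
\begin{aligned}
R(e_\gamma, e_\delta)\, e_\beta &= \nabla_{e_\gamma}\nabla_{e_\delta} e_\beta - \nabla_{e_\delta}\nabla_{e_\gamma} e_\beta, \\
&= \nabla_{e_\gamma}(\Gamma^\alpha{}_{\delta\beta}\, e_\alpha) - \nabla_{e_\delta}(\Gamma^\alpha{}_{\gamma\beta}\, e_\alpha), \\
&= \Gamma^\alpha{}_{\delta\beta,\gamma}\, e_\alpha + \Gamma^\mu{}_{\delta\beta}\Gamma^\alpha{}_{\gamma\mu}\, e_\alpha - \Gamma^\alpha{}_{\gamma\beta,\delta}\, e_\alpha - \Gamma^\mu{}_{\gamma\beta}\Gamma^\alpha{}_{\delta\mu}\, e_\alpha, \\
&= [\Gamma^\alpha{}_{\delta\beta,\gamma} - \Gamma^\alpha{}_{\gamma\beta,\delta} + \Gamma^\mu{}_{\delta\beta}\Gamma^\alpha{}_{\gamma\mu} - \Gamma^\mu{}_{\gamma\beta}\Gamma^\alpha{}_{\delta\mu}]\, e_\alpha, \\
&= R^\alpha{}_{\beta\gamma\delta}\, e_\alpha,
\end{aligned}
\]
where the components of the Riemann tensor are defined by:
\[ R^\alpha{}_{\beta\gamma\delta} = \Gamma^\alpha{}_{\delta\beta,\gamma} - \Gamma^\alpha{}_{\gamma\beta,\delta} + \Gamma^\mu{}_{\delta\beta}\Gamma^\alpha{}_{\gamma\mu} - \Gamma^\mu{}_{\gamma\beta}\Gamma^\alpha{}_{\delta\mu}. \qquad (6.14) \]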
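The component formula for the Riemann tensor can be exercised on the unit sphere. The sketch below is an added illustration with hypothetical names, not the author's code. It assumes the index convention $R(e_\gamma, e_\delta)e_\beta = R^\alpha{}_{\beta\gamma\delta}e_\alpha$ with $\nabla_{e_\alpha}e_\beta = \Gamma^\gamma{}_{\alpha\beta}e_\gamma$, takes the standard Christoffel symbols of the sphere as input, and approximates their $\theta$-derivatives by central differences.

```python
import math


def christoffel(theta):
    """Christoffel symbols of the unit sphere; gamma[c][a][b] = Gamma^c_{ab}."""
    gamma = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    gamma[0][1][1] = -math.sin(theta) * math.cos(theta)
    gamma[1][0][1] = gamma[1][1][0] = math.cos(theta) / math.sin(theta)
    return gamma


def d_christoffel(theta, h=1e-5):
    """Central-difference derivative with respect to theta."""
    gp, gm = christoffel(theta + h), christoffel(theta - h)
    return [
        [[(gp[c][a][b] - gm[c][a][b]) / (2 * h) for b in range(2)] for a in range(2)]
        for c in range(2)
    ]


def riemann(theta):
    """R[a][b][c][d] approximates R^a_{bcd}; index 0 is theta, index 1 is phi."""
    G = christoffel(theta)
    dG_theta = d_christoffel(theta)

    def dG(a, b, c, k):
        # Derivative of Gamma^a_{bc} along coordinate k; nothing depends on phi.
        return dG_theta[a][b][c] if k == 0 else 0.0

    R = [[[[0.0, 0.0] for _ in range(2)] for _ in range(2)] for _ in range(2)]
    for a in range(2):
        for b in range(2):
            for c in range(2):
                for d in range(2):
                    R[a][b][c][d] = dG(a, d, b, c) - dG(a, c, b, d) + sum(
                        G[m][d][b] * G[a][c][m] - G[m][c][b] * G[a][d][m]
                        for m in range(2)
                    )
    return R


if __name__ == "__main__":
    theta = 1.0
    R = riemann(theta)
    print(R[0][1][0][1], math.sin(theta) ** 2)
```

Dividing $R^\theta{}_{\phi\theta\phi}$ by $g_{\phi\phi} = \sin^2\theta$ gives the value 1, the Gaussian curvature of the unit sphere.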


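Let $X = X^\mu e_\mu$ and $\alpha = X_\mu\theta^\mu$ be a contravariant and a covariant vector
field respectively. Using the notation $\nabla_\alpha = \nabla_{e_\alpha}$, it is almost trivial to compute
the covariant derivatives. The results are,
\[
\begin{aligned}
\nabla_\beta X &= (X^\mu{}_{,\beta} + X^\nu\Gamma^\mu{}_{\beta\nu})\, e_\mu, \\
\nabla_\beta\alpha &= (X_{\mu,\beta} - X_\nu\Gamma^\nu{}_{\beta\mu})\, \theta^\mu. \qquad (6.15)
\end{aligned}
\]
We show the details of the first computation, and leave the second one as an
easy exercise
\[
\begin{aligned}
\nabla_\beta X &= \nabla_\beta(X^\mu e_\mu), \qquad &(6.16) \\
&= X^\mu{}_{,\beta}\, e_\mu + X^\mu\Gamma^\delta{}_{\beta\mu}\, e_\delta, \qquad &(6.17) \\
&= (X^\mu{}_{,\beta} + X^\nu\Gamma^\mu{}_{\beta\nu})\, e_\mu. \qquad &(6.18)
\end{aligned}
\]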


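In classical notation, the covariant derivatives $X^\mu{}_{\|\beta}$ and $X_{\mu\|\beta}$ are given in
terms of the tensor components,
\[
\begin{aligned}
X^\mu{}_{\|\beta} &= X^\mu{}_{,\beta} + X^\nu\Gamma^\mu{}_{\beta\nu}, \\
X_{\mu\|\beta} &= X_{\mu,\beta} - X_\nu\Gamma^\nu{}_{\beta\mu}. \qquad (6.19)
\end{aligned}
\]
It is also straightforward to establish the Ricci identities
\[
\begin{aligned}
X^\mu{}_{\|\alpha\beta} - X^\mu{}_{\|\beta\alpha} &= X^\nu R^\mu{}_{\nu\alpha\beta}, \\
X_{\mu\|\alpha\beta} - X_{\mu\|\beta\alpha} &= -X_\nu R^\nu{}_{\mu\alpha\beta}. \qquad (6.20)
\end{aligned}
\]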


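Again, we show the computation for the first identity and leave the second as
an exercise. We take the second derivative, and then reverse the order,
\[
\begin{aligned}
\nabla_\alpha\nabla_\beta X &= \nabla_\alpha(X^\mu{}_{,\beta}\, e_\mu + X^\nu\Gamma^\mu{}_{\beta\nu}\, e_\mu), \\
&= X^\mu{}_{,\beta\alpha}\, e_\mu + X^\mu{}_{,\beta}\Gamma^\delta{}_{\alpha\mu}\, e_\delta + X^\nu{}_{,\alpha}\Gamma^\mu{}_{\beta\nu}\, e_\mu + X^\nu\Gamma^\mu{}_{\beta\nu,\alpha}\, e_\mu + X^\nu\Gamma^\mu{}_{\beta\nu}\Gamma^\delta{}_{\alpha\mu}\, e_\delta, \\
\nabla_\alpha\nabla_\beta X &= (X^\mu{}_{,\beta\alpha} + X^\nu{}_{,\beta}\Gamma^\mu{}_{\alpha\nu} + X^\nu{}_{,\alpha}\Gamma^\mu{}_{\beta\nu} + X^\nu\Gamma^\mu{}_{\beta\nu,\alpha} + X^\nu\Gamma^\delta{}_{\beta\nu}\Gamma^\mu{}_{\alpha\delta})\, e_\mu, \\
\nabla_\beta\nabla_\alpha X &= (X^\mu{}_{,\alpha\beta} + X^\nu{}_{,\alpha}\Gamma^\mu{}_{\beta\nu} + X^\nu{}_{,\beta}\Gamma^\mu{}_{\alpha\nu} + X^\nu\Gamma^\mu{}_{\alpha\nu,\beta} + X^\nu\Gamma^\delta{}_{\alpha\nu}\Gamma^\mu{}_{\beta\delta})\, e_\mu.
\end{aligned}
\]
Subtracting the last two equations, only the last two terms of each survive, and
we get the desired result,
\[
\begin{aligned}
2\nabla_{[\alpha}\nabla_{\beta]}(X) &= X^\nu(\Gamma^\mu{}_{\beta\nu,\alpha} - \Gamma^\mu{}_{\alpha\nu,\beta} + \Gamma^\delta{}_{\beta\nu}\Gamma^\mu{}_{\alpha\delta} - \Gamma^\delta{}_{\alpha\nu}\Gamma^\mu{}_{\beta\delta})\, e_\mu, \\
2\nabla_{[\alpha}\nabla_{\beta]}(X^\mu e_\mu) &= (X^\nu R^\mu{}_{\nu\alpha\beta})\, e_\mu.
\end{aligned}
\]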


In the literature, many authors use the notation $\nabla_\beta X^\mu$ to denote the covariant
derivative $X^\mu{}_{\|\beta}$, but it is really an (excusable) abuse of notation that arises
from thinking of tensors as the components of the tensors. The Ricci identities
are the basis for the notion of holonomy, namely, the simple interpretation that
the failure of parallel transport to commute along the edges of a rectangle
indicates the presence of curvature. With more effort and repeated use of
the Leibniz rule, one can establish more elaborate Ricci identities for higher order
tensors. If one assumes zero torsion, the Ricci identities of higher order tensors
just involve more terms with the curvature. If the torsion is not zero, there are
additional terms involving the torsion tensor; in this case it is perhaps a bit
more elegant to use the covariant differential introduced in the next section, so
we will postpone the computation until then.

The generalization of the theorema egregium to manifolds comes from the 


same principle of splitting the curvature tensor of the ambient space into the 
tangential on normal components. In the case of a hypersurface with normal 


6.2. SUBMANIFOLDS 193 


N and tangent vectors X, Y, Z, we have: 


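$$
\begin{aligned}
\bar R(X,Y)Z &= \bar\nabla_X \bar\nabla_Y Z - \bar\nabla_Y \bar\nabla_X Z - \bar\nabla_{[X,Y]} Z, \\
&= \bar\nabla_X(\nabla_Y Z + \langle LY, Z\rangle N) - \bar\nabla_Y(\nabla_X Z + \langle LX, Z\rangle N) - \bar\nabla_{[X,Y]} Z, \\
&= \nabla_X \nabla_Y Z + \langle LX, \nabla_Y Z\rangle N + X\langle LY, Z\rangle N + \langle LY, Z\rangle LX \\
&\quad - \nabla_Y \nabla_X Z - \langle LY, \nabla_X Z\rangle N - Y\langle LX, Z\rangle N - \langle LX, Z\rangle LY \\
&\quad - \nabla_{[X,Y]} Z - \langle L([X,Y]), Z\rangle N, \\
&= \nabla_X \nabla_Y Z + \langle LX, \nabla_Y Z\rangle N + \langle \nabla_X LY, Z\rangle N + \langle LY, \nabla_X Z\rangle N + \langle LY, Z\rangle LX \\
&\quad - \nabla_Y \nabla_X Z - \langle LY, \nabla_X Z\rangle N - \langle \nabla_Y LX, Z\rangle N - \langle LX, \nabla_Y Z\rangle N - \langle LX, Z\rangle LY \\
&\quad - \nabla_{[X,Y]} Z - \langle L([X,Y]), Z\rangle N, \\
&= R(X,Y)Z + \langle LY, Z\rangle LX - \langle LX, Z\rangle LY \\
&\quad + \{\langle \nabla_X LY, Z\rangle - \langle \nabla_Y LX, Z\rangle - \langle L([X,Y]), Z\rangle\} N.
\end{aligned}
$$

Here the overbar marks the connection and curvature of the ambient space. If the
ambient space is $\mathbf{R}^n$, the ambient curvature tensor $\bar R$ is zero, so we can set the
tangential and normal components on the right to zero. Noting that the normal
component is zero for all $Z$, we get:

$$R(X,Y)Z + \langle LY, Z\rangle LX - \langle LX, Z\rangle LY = 0, \tag{6.21}$$
$$\nabla_X LY - \nabla_Y LX - L([X,Y]) = 0. \tag{6.22}$$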
In particular, if $n = 3$, and at each point in the surface, the vectors $X$ and $Y$
constitute an orthonormal basis of the tangent space, we get the coordinate-free theorema
egregium

$$K = \langle R(X,Y)X, Y\rangle = \langle LX, X\rangle\langle LY, Y\rangle - \langle LY, X\rangle\langle LX, Y\rangle = \det(L). \tag{6.23}$$

The expression 6.22 is the coordinate-independent version of the equation of
Codazzi.


We expect the covariant definition of the torsion and curvature tensors to be 
consistent with the formalism of Cartan. 


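6.2.5 Theorem Equations of Structure.

$$\Theta^\alpha = d\theta^\alpha + \omega^\alpha{}_\beta \wedge \theta^\beta, \tag{6.24}$$
$$\Omega^\alpha{}_\beta = d\omega^\alpha{}_\beta + \omega^\alpha{}_\gamma \wedge \omega^\gamma{}_\beta. \tag{6.25}$$

To verify this is the case, we define:

$$T(X,Y) = \Theta^\alpha(X,Y)\, e_\alpha, \tag{6.26}$$
$$R(X,Y)\, e_\beta = \Omega^\alpha{}_\beta(X,Y)\, e_\alpha. \tag{6.27}$$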


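Recalling that any tangent vector $X$ can be expressed in terms of the basis as
$X = \theta^\alpha(X)\, e_\alpha$, we can carry out a straightforward computation:

$$
\begin{aligned}
\Theta^\alpha(X,Y)\, e_\alpha &= T(X,Y) = \nabla_X Y - \nabla_Y X - [X,Y], \\
&= \nabla_X(\theta^\alpha(Y)\, e_\alpha) - \nabla_Y(\theta^\alpha(X)\, e_\alpha) - \theta^\alpha([X,Y])\, e_\alpha, \\
&= X(\theta^\alpha(Y))\, e_\alpha + \theta^\alpha(Y)\, \omega^\beta{}_\alpha(X)\, e_\beta - Y(\theta^\alpha(X))\, e_\alpha \\
&\quad - \theta^\alpha(X)\, \omega^\beta{}_\alpha(Y)\, e_\beta - \theta^\alpha([X,Y])\, e_\alpha, \\
&= \{X(\theta^\alpha(Y)) - Y(\theta^\alpha(X)) - \theta^\alpha([X,Y]) + \omega^\alpha{}_\beta(X)\,\theta^\beta(Y) - \omega^\alpha{}_\beta(Y)\,\theta^\beta(X)\}\, e_\alpha, \\
&= \{(d\theta^\alpha + \omega^\alpha{}_\beta \wedge \theta^\beta)(X,Y)\}\, e_\alpha,
\end{aligned}
$$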


where we have introduced a coordinate-free definition of the differential of the
one form $\theta$ by

$$d\theta(X,Y) = X(\theta(Y)) - Y(\theta(X)) - \theta([X,Y]). \tag{6.28}$$


It is easy to verify that this definition of the differential of a one form satisfies 
all the required properties of the exterior derivative, and that it is consistent 
with the coordinate version of the differential introduced in Chapter 2. We 
conclude that 

$$\Theta^\alpha = d\theta^\alpha + \omega^\alpha{}_\beta \wedge \theta^\beta, \tag{6.29}$$


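which is indeed the first Cartan equation of structure. Proceeding along the
same lines, we compute:

$$
\begin{aligned}
\Omega^\alpha{}_\beta(X,Y)\, e_\alpha &= \nabla_X \nabla_Y e_\beta - \nabla_Y \nabla_X e_\beta - \nabla_{[X,Y]} e_\beta, \\
&= \nabla_X(\omega^\alpha{}_\beta(Y)\, e_\alpha) - \nabla_Y(\omega^\alpha{}_\beta(X)\, e_\alpha) - \omega^\alpha{}_\beta([X,Y])\, e_\alpha, \\
&= X(\omega^\alpha{}_\beta(Y))\, e_\alpha + \omega^\alpha{}_\beta(Y)\, \omega^\gamma{}_\alpha(X)\, e_\gamma - Y(\omega^\alpha{}_\beta(X))\, e_\alpha \\
&\quad - \omega^\alpha{}_\beta(X)\, \omega^\gamma{}_\alpha(Y)\, e_\gamma - \omega^\alpha{}_\beta([X,Y])\, e_\alpha, \\
&= \{(d\omega^\alpha{}_\beta + \omega^\alpha{}_\gamma \wedge \omega^\gamma{}_\beta)(X,Y)\}\, e_\alpha,
\end{aligned}
$$

thus arriving at the second equation of structure

$$\Omega^\alpha{}_\beta = d\omega^\alpha{}_\beta + \omega^\alpha{}_\gamma \wedge \omega^\gamma{}_\beta. \tag{6.30}$$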


The connection and curvature forms are matrix-valued quantities. Using matrix
multiplication notation, we can abbreviate the equations of structure as

$$\Theta = d\theta + \omega \wedge \theta, \qquad \Omega = d\omega + \omega \wedge \omega. \tag{6.31}$$


Taking the exterior derivative of the structure equations gives some interesting 
results. Here is the first computation, 


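$$
\begin{aligned}
d\Theta &= d\omega \wedge \theta - \omega \wedge d\theta, \\
&= d\omega \wedge \theta - \omega \wedge (\Theta - \omega \wedge \theta), \\
&= d\omega \wedge \theta - \omega \wedge \Theta + \omega \wedge \omega \wedge \theta, \\
&= (d\omega + \omega \wedge \omega) \wedge \theta - \omega \wedge \Theta, \\
&= \Omega \wedge \theta - \omega \wedge \Theta,
\end{aligned}
$$

so,

$$d\Theta + \omega \wedge \Theta = \Omega \wedge \theta. \tag{6.32}$$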


Similarly, taking d of the second structure equation we get, 


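$$
\begin{aligned}
d\Omega &= d\omega \wedge \omega - \omega \wedge d\omega, \\
&= (\Omega - \omega \wedge \omega) \wedge \omega - \omega \wedge (\Omega - \omega \wedge \omega).
\end{aligned}
$$

The two cubic terms $\omega \wedge \omega \wedge \omega$ cancel. Hence,

$$d\Omega = \Omega \wedge \omega - \omega \wedge \Omega. \tag{6.33}$$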


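Equations 6.32 and 6.33 are called the first and second Bianchi identities. The
relationship between the torsion and Riemann tensor components and the
corresponding differential forms is given by

$$\Theta^\alpha = \tfrac{1}{2}\, T^\alpha{}_{\gamma\delta}\, \theta^\gamma \wedge \theta^\delta, \qquad \Omega^\alpha{}_\beta = \tfrac{1}{2}\, R^\alpha{}_{\beta\gamma\delta}\, \theta^\gamma \wedge \theta^\delta. \tag{6.34}$$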


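In the case of a non-coordinate frame in which the Lie bracket of frame vectors
does not vanish, we first write them as linear combinations of the frame

$$[e_\beta, e_\gamma] = C^\alpha{}_{\beta\gamma}\, e_\alpha. \tag{6.35}$$

The components of the torsion and Riemann tensors are then given by

$$
\begin{aligned}
T^\alpha{}_{\beta\gamma} &= \Gamma^\alpha{}_{\beta\gamma} - \Gamma^\alpha{}_{\gamma\beta} - C^\alpha{}_{\beta\gamma}, \\
R^\alpha{}_{\beta\gamma\delta} &= \Gamma^\alpha{}_{\delta\beta,\gamma} - \Gamma^\alpha{}_{\gamma\beta,\delta} + \Gamma^\mu{}_{\delta\beta}\Gamma^\alpha{}_{\gamma\mu} - \Gamma^\mu{}_{\gamma\beta}\Gamma^\alpha{}_{\delta\mu} - \Gamma^\alpha{}_{\mu\beta}\, C^\mu{}_{\gamma\delta}.
\end{aligned} \tag{6.36}
$$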


The Riemann tensor for a torsion-free connection has the following symmetries; 


R(X, Y) = —R(Y, X), 
< R(X, Y)Z,W > =- < R(X,Y)W,Z >, 
R(X,Y)Z + R(Z,X)Y + R(Y,Z)X =0. (6.37) 


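In terms of components, the Riemann tensor symmetries can be expressed as

$$
\begin{aligned}
R_{\alpha\beta\gamma\delta} &= -R_{\alpha\beta\delta\gamma} = -R_{\beta\alpha\gamma\delta}, \\
R_{\alpha\beta\gamma\delta} &= R_{\gamma\delta\alpha\beta}, \\
R_{\alpha\beta\gamma\delta} &+ R_{\alpha\gamma\delta\beta} + R_{\alpha\delta\beta\gamma} = 0.
\end{aligned} \tag{6.38}
$$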


The last cyclic equation is the tensor version of the first Bianchi identity with
zero torsion. It follows immediately from setting $\Omega \wedge \theta = 0$ and taking a cyclic
permutation of the antisymmetric indices $\{\beta, \gamma, \delta\}$ of the Riemann tensor. The
symmetries reduce the number of independent components in an $n$-dimensional
manifold from $n^4$ to $n^2(n^2 - 1)/12$. Thus, for a 4-dimensional space, there are
at most 20 independent components. The derivation of the tensor version of
the second Bianchi identity from the elegant differential forms version takes a
bit more effort. In components the formula

$$d\Omega = \Omega \wedge \omega - \omega \wedge \Omega$$
reads, 
Re grin OP A O* AO = (Eug R? pra — IF po RP aero A 0% AO, 
where we used the notation,

$$\nabla_\mu R^\rho{}_{\sigma\kappa\lambda} = R^\rho{}_{\sigma\kappa\lambda;\mu}.$$

Taking a cyclic permutation on the antisymmetric indices $\kappa, \lambda, \mu$, and using
some index gymnastics to show that the right hand side becomes zero, the tensor
version of the second Bianchi identity for zero torsion becomes

$$R^\rho{}_{\sigma[\kappa\lambda;\mu]} = 0. \tag{6.39}$$
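
As a quick arithmetic check of the component count $n^2(n^2-1)/12$ quoted above, the following minimal Python sketch tabulates the number of independent Riemann components next to the raw count $n^4$. The function name is an illustrative choice, not notation from the text.

```python
def riemann_independent_components(n: int) -> int:
    """Number of independent components n^2 (n^2 - 1) / 12 of the Riemann
    tensor of a torsion-free metric connection in dimension n."""
    return n * n * (n * n - 1) // 12


if __name__ == "__main__":
    for n in (2, 3, 4):
        # raw component count n^4 versus the count after the symmetries
        print(n, n ** 4, riemann_independent_components(n))
```

For a surface ($n = 2$) a single component survives, which is consistent with the curvature of a surface being captured by the Gaussian curvature alone.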


6.3 Sectional Curvature 


Let $\{M, g\}$ be a Riemannian manifold with Levi-Civita connection $\nabla$ and
curvature tensor $R(X, Y)$. In local coordinates at a point $p \in M$ we can express
the components

$$R = R_{\mu\nu\rho\sigma}\, dx^\mu\, dx^\nu\, dx^\rho\, dx^\sigma$$

of a covariant tensor of rank 4. With this in mind, we define a multilinear
function

$$R : T_p(M) \otimes T_p(M) \otimes T_p(M) \otimes T_p(M) \to \mathbf{R},$$

by

$$R(W, Y, X, Z) = \langle W, R(X,Y)Z\rangle. \tag{6.40}$$

In this notation, the symmetries of the tensor take the form,

$$
\begin{aligned}
&R(W, X, Y, Z) = -R(W, Y, X, Z), \\
&R(W, X, Y, Z) + R(W, Z, X, Y) + R(W, Y, Z, X) = 0.
\end{aligned} \tag{6.41}
$$


From the metric, we can also define a multilinear function

$$G(W, Y, X, Z) = \langle Z, Y\rangle\langle X, W\rangle - \langle Z, X\rangle\langle Y, W\rangle.$$

Now, consider any 2-dimensional plane $V_p \subset T_p(M)$ and let $X, Y \in V_p$ be
linearly independent. Then,

$$G(X, Y, X, Y) = \langle X, X\rangle\langle Y, Y\rangle - \langle X, Y\rangle^2$$

is the square of the area of the parallelogram spanned by $X$
and $Y$. If we perform a linear, non-singular change of coordinates,

$$X' = aX + bY, \qquad Y' = cX + dY, \qquad ad - bc \neq 0,$$

then both $G(X,Y,X,Y)$ and $R(X,Y,X,Y)$ transform by the square of the
determinant $D = ad - bc$, so the ratio is independent of the choice of vectors.
We define the sectional curvature of the subspace $V_p$ by

$$K(V_p) = \frac{R(X,Y,X,Y)}{G(X,Y,X,Y)} = \frac{R(X,Y,X,Y)}{\langle X, X\rangle\langle Y, Y\rangle - \langle X, Y\rangle^2}. \tag{6.42}$$


The set of values of the sectional curvatures for all planes in $T_p(M)$ completely
determines the Riemannian curvature at $p$. For a surface in $\mathbf{R}^3$ the sectional
curvature is the Gaussian curvature, and the formula is equivalent to the
theorema egregium. If $K(V_p)$ is constant for all planes $V_p \subset T_p(M)$ and for all
points $p \in M$, we say that $M$ is a space of constant curvature. For a space of
constant curvature $k$, we have

$$R(X,Y)Z = k(\langle Z, Y\rangle X - \langle Z, X\rangle Y). \tag{6.43}$$

In local coordinates, the equation gives

$$R_{\mu\nu\rho\sigma} = k(g_{\nu\sigma}\, g_{\mu\rho} - g_{\nu\rho}\, g_{\mu\sigma}). \tag{6.44}$$
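
The claim that $G(X,Y,X,Y)$ scales by the square of the determinant, and that the sectional curvature is therefore independent of the basis chosen for the plane, can be checked with exact rational arithmetic. The sketch below is only an illustration under stated assumptions: it uses the Euclidean dot product as the metric at a point, and it models a space of constant curvature by taking $R = k\,G$, which is what equations 6.40 and 6.43 give together. All function names are hypothetical.

```python
from fractions import Fraction


def dot(u, v):
    """Euclidean inner product, standing in for the metric at a point."""
    return sum(a * b for a, b in zip(u, v))


def G(W, Y, X, Z):
    """G(W, Y, X, Z) = <Z,Y><X,W> - <Z,X><Y,W>."""
    return dot(Z, Y) * dot(X, W) - dot(Z, X) * dot(Y, W)


def constant_curvature_R(k):
    """Curvature function of a constant curvature space (assumed R = k G)."""
    def R(W, Y, X, Z):
        return k * G(W, Y, X, Z)
    return R


def sectional_curvature(R, X, Y):
    """K = R(X,Y,X,Y) / G(X,Y,X,Y) for linearly independent X, Y."""
    return R(X, Y, X, Y) / G(X, Y, X, Y)


def change_basis(X, Y, a, b, c, d):
    """Return X' = aX + bY and Y' = cX + dY."""
    Xp = [a * x + b * y for x, y in zip(X, Y)]
    Yp = [c * x + d * y for x, y in zip(X, Y)]
    return Xp, Yp


X = [Fraction(1), Fraction(2), Fraction(0)]
Y = [Fraction(0), Fraction(1), Fraction(3)]
a, b, c, d = 2, 1, -1, 3
D = a * d - b * c
Xp, Yp = change_basis(X, Y, a, b, c, d)
k = Fraction(-1, 4)
R = constant_curvature_R(k)

if __name__ == "__main__":
    print(G(Xp, Yp, Xp, Yp) == D * D * G(X, Y, X, Y))
    print(sectional_curvature(R, X, Y), sectional_curvature(R, Xp, Yp))
```

Using `Fraction` avoids any floating-point tolerance in the comparison, so the scaling by $D^2$ is verified exactly for this sample plane.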


6.3.1 Example
The model space of manifolds of constant curvature is a quadric hypersurface
$M$ of $\mathbf{R}^{n+1}$ with metric

$$ds^2 = \epsilon k^2 dt^2 + (dy^1)^2 + \cdots + (dy^n)^2,$$

given by the equation

$$M : \epsilon k^2 t^2 + (y^1)^2 + \cdots + (y^n)^2 = \epsilon k^2, \qquad t \neq 0,$$

where $k$ is a constant and $\epsilon = \pm 1$. For the purposes of this example it will
actually be simpler to completely abandon the summation convention. Thus,
we write the quadric as

$$\epsilon k^2 t^2 + \Sigma_i (y^i)^2 = \epsilon k^2.$$

If $k = 0$, the space is flat. If $\epsilon = 1$, let $(y^0)^2 = k^2 t^2$ and the quadric is isometric
to a sphere of constant curvature $1/k^2$. If $\epsilon = -1$, then $\Sigma_i (y^i)^2 = -k^2(1 - t^2) > 0$,
so $t^2 > 1$ and the surface is a hyperboloid of two sheets. Consider the
mapping from $\mathbf{R}^{n+1}$ to $\mathbf{R}^n$ given by

$$x^i = y^i / t.$$


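We would like to compute the induced metric on the surface. We have

$$-k^2 t^2 + \Sigma_i (y^i)^2 = -k^2 t^2 + t^2\, \Sigma_i (x^i)^2 = -k^2,$$

so

$$t^2 = \frac{-k^2}{-k^2 + \Sigma_i (x^i)^2}.$$

Taking the differential, we get

$$t\, dt = \frac{k^2\, \Sigma_i (x^i dx^i)}{[-k^2 + \Sigma_i (x^i)^2]^2}.$$

Squaring and dividing by $t^2$ we also have

$$dt^2 = \frac{-k^2\, [\Sigma_i (x^i dx^i)]^2}{[-k^2 + \Sigma_i (x^i)^2]^3}.$$

From the product rule, we have $dy^i = x^i\, dt + t\, dx^i$, so the metric is

$$
\begin{aligned}
ds^2 &= -k^2 dt^2 + [\Sigma_i (x^i)^2]\, dt^2 + 2t\, dt\, \Sigma_i (x^i dx^i) + t^2\, \Sigma_i (dx^i)^2, \\
&= [-k^2 + \Sigma_i (x^i)^2]\, dt^2 + 2t\, dt\, \Sigma_i (x^i dx^i) + t^2\, \Sigma_i (dx^i)^2, \\
&= \frac{-k^2\, [\Sigma_i (x^i dx^i)]^2}{[-k^2 + \Sigma_i (x^i)^2]^2} + \frac{2k^2\, [\Sigma_i (x^i dx^i)]^2}{[-k^2 + \Sigma_i (x^i)^2]^2} + \frac{-k^2\, \Sigma_i (dx^i)^2}{-k^2 + \Sigma_i (x^i)^2}, \\
&= k^2\, \frac{[k^2 - \Sigma_i (x^i)^2]\, \Sigma_i (dx^i)^2 + [\Sigma_i (x^i dx^i)]^2}{[k^2 - \Sigma_i (x^i)^2]^2}.
\end{aligned}
$$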


It is not obvious, but in fact, the space is also of constant curvature $(-1/k^2)$.
For an elegant proof, see [18]. When $n = 4$ and $\epsilon = -1$, the group leaving the
metric

$$ds^2 = -k^2 dt^2 + (dy^1)^2 + (dy^2)^2 + (dy^3)^2 + (dy^4)^2$$

invariant is the Lorentz group $O(1, 4)$. With a minor modification of the above,
consider the quadric

$$M : -k^2 t^2 + (y^1)^2 + \cdots + (y^4)^2 = k^2.$$


In this case, the quadric is a hyperboloid of one sheet, and the submanifold
with the induced metric is called the de Sitter space. The isotropy subgroup
that leaves $(1,0,0,0,0)$ fixed is $O(1,3)$ and the manifold is diffeomorphic to
$O(1,4)/O(1,3)$. Many alternative forms of the de Sitter metric exist in the
literature. One that is particularly appealing is obtained as follows. Write the
metric in ambient space as

$$ds^2 = -(dy^0)^2 + (dy^1)^2 + (dy^2)^2 + (dy^3)^2 + (dy^4)^2,$$

with the quadric given by

$$M : -(y^0)^2 + (y^1)^2 + (y^2)^2 + (y^3)^2 + (y^4)^2 = k^2.$$

Let

$$\Sigma_i (x^i)^2 = 1, \qquad i = 1, \dots, 4,$$

so that the $x^i$ represent a unit sphere $S^3$. Introduce the coordinates for $M$

$$y^0 = k \sinh(\tau/k), \qquad y^i = k \cosh(\tau/k)\, x^i.$$

Then, we have

$$dy^0 = \cosh(\tau/k)\, d\tau, \qquad dy^i = \sinh(\tau/k)\, x^i\, d\tau + k \cosh(\tau/k)\, dx^i.$$

Since $\Sigma_i (x^i)^2 = 1$ implies $\Sigma_i\, x^i dx^i = 0$, the cross terms drop out and the induced metric on $M$ becomes,

$$
\begin{aligned}
ds^2 &= -[\cosh^2(\tau/k) - \sinh^2(\tau/k)\, \Sigma_i (x^i)^2]\, d\tau^2 + k^2 \cosh^2(\tau/k)\, \Sigma_i (dx^i)^2, \\
&= -d\tau^2 + k^2 \cosh^2(\tau/k)\, d\Omega^2,
\end{aligned}
$$

where $d\Omega^2$ is the standard metric on the unit sphere $S^3$. The most natural coordinates for
$S^3$ are the Euler angles and Cayley-Klein parameters. The interpretation
of this space-time is that we have a spatial 3-sphere which propagates
in time by shrinking to a minimum radius at the throat of the hyperboloid,
followed by an expansion. Being a space of constant curvature, the Ricci tensor
is proportional to the metric, so this is an Einstein manifold.
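
The de Sitter parametrization can be sanity-checked numerically. The sketch below assumes the coordinates $y^0 = k\sinh(\tau/k)$, $y^i = k\cosh(\tau/k)\,x^i$ with $(x^i)$ on the unit 3-sphere, and simply confirms that such points satisfy $-(y^0)^2 + \Sigma_i (y^i)^2 = k^2$, and that the radius $k\cosh(\tau/k)$ of the spatial sphere is smallest at $\tau = 0$. The function names are illustrative only.

```python
import math


def embed(tau, x, k):
    """Ambient coordinates (y0, y1, ..., y4) of the point (tau, x),
    where x is assumed to lie on the unit sphere S^3."""
    y0 = k * math.sinh(tau / k)
    return [y0] + [k * math.cosh(tau / k) * xi for xi in x]


def quadric(y):
    """Left-hand side -(y0)^2 + (y1)^2 + ... + (y4)^2 of the hyperboloid."""
    return -y[0] ** 2 + sum(yi ** 2 for yi in y[1:])


def sphere_radius(tau, k):
    """Radius of the spatial 3-sphere at proper time tau."""
    return k * math.cosh(tau / k)


if __name__ == "__main__":
    k = 2.0
    x = [0.5, 0.5, 0.5, 0.5]  # a point of the unit S^3
    for tau in (-3.0, 0.0, 1.3):
        print(tau, quadric(embed(tau, x, k)), sphere_radius(tau, k))
```

The printed quadric values should all be close to $k^2 = 4$, while the radius decreases toward $\tau = 0$ and grows afterwards, matching the shrink-then-expand picture described above.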


6.4 Big D 


In this section we discuss the notion of a connection on a vector bundle $E$.
Let $M$ be a smooth manifold and as usual we denote by $T^r_s(p)$ the vector space
of type $\binom{r}{s}$ tensors at a point $p \in M$. The formalism applies to any vector
bundle, but in this section we are primarily concerned with the case where $E$
is the tensor bundle $E = T^r_s(M)$. Sections $\Gamma(E) = \mathcal{T}^r_s(M)$ of this bundle
are called tensor fields on $M$. For general vector bundles, we use the notation
$s \in \Gamma(E)$ for the sections of the bundle. The section that maps every point of $M$
to the zero vector is called the zero section. Let $\{e_\alpha\}$ be an orthonormal frame
with dual forms $\{\theta^\alpha\}$. We define the space $\Omega^p(M, E)$ of tensor-valued $p$-forms as
sections of the bundle,

$$\Omega^p(M, E) = \Gamma(E \otimes \Lambda^p(M)). \tag{6.45}$$

As in equation 2.63, a tensor-valued $p$-form is a tensor of type $\binom{r}{s+p}$ with
components,

$$T = T^{\alpha_1 \dots \alpha_r}{}_{\beta_1 \dots \beta_s\, \gamma_1 \dots \gamma_p}\; e_{\alpha_1} \otimes \cdots \otimes e_{\alpha_r} \otimes \theta^{\beta_1} \otimes \cdots \otimes \theta^{\beta_s} \otimes \theta^{\gamma_1} \wedge \cdots \wedge \theta^{\gamma_p}. \tag{6.46}$$


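A tensor-valued 0-form is just a regular tensor field $T \in \mathcal{T}^r_s(M)$. The main
examples of tensor-valued forms are the torsion and the curvature forms

$$\Theta = \Theta^\alpha \otimes e_\alpha, \qquad \Omega = \Omega^\alpha{}_\beta \otimes e_\alpha \otimes \theta^\beta. \tag{6.47}$$

The tensorial components of the torsion tensor would then be written as

$$
\begin{aligned}
T &= T^\alpha{}_{\beta\gamma}\; e_\alpha \otimes \theta^\beta \otimes \theta^\gamma, \\
&= \tfrac{1}{2}\, T^\alpha{}_{\beta\gamma}\; e_\alpha \otimes \theta^\beta \wedge \theta^\gamma, \\
&= e_\alpha \otimes (\tfrac{1}{2}\, T^\alpha{}_{\beta\gamma}\, \theta^\beta \wedge \theta^\gamma),
\end{aligned}
$$

since the tensor is antisymmetric in the lower indices. Similarly, the tensorial
components of the curvature are

$$
\begin{aligned}
\Omega &= \tfrac{1}{2}\, R^\alpha{}_{\beta\gamma\delta}\; e_\alpha \otimes \theta^\beta \otimes \theta^\gamma \wedge \theta^\delta, \\
&= e_\alpha \otimes \theta^\beta \otimes (\tfrac{1}{2}\, R^\alpha{}_{\beta\gamma\delta}\, \theta^\gamma \wedge \theta^\delta).
\end{aligned}
$$

The connection forms

$$\omega = \omega^\alpha{}_\beta \otimes e_\alpha \otimes \theta^\beta \tag{6.48}$$

are matrix-valued, but they are not tensorial forms. If $T$ is a type $\binom{r}{s}$ tensor
field, and $\alpha$ a $p$-form, we can write a tensor-valued $p$-form as $T \otimes \alpha \in \Omega^p(M, E)$.
We seek an operator that behaves like a covariant derivative $\nabla$ for tensors
and exterior derivative $d$ for forms.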


6.4.1 Linear Connections 


Given a vector field $X$ and a smooth function $f$, we define a linear connection
as a map
$$\nabla_X : \Gamma(T^r_s) \to \Gamma(T^r_s)$$
with the following properties
1) $\nabla_X(f) = X(f)$,
1) $\nabla_{fX} T = f\, \nabla_X T$,
2) $\nabla_{X+Y} T = \nabla_X T + \nabla_Y T$, for all $X, Y \in \mathfrak{X}(M)$,
3) $\nabla_X(T_1 + T_2) = \nabla_X T_1 + \nabla_X T_2$, for $T_1, T_2 \in \Gamma(T^r_s)$,
4) $\nabla_X(fT) = X(f)\, T + f\, \nabla_X T$.
If instead of the tensor bundle we have a general vector bundle $E$, we replace the
tensor fields in the definition above by sections $s \in \Gamma(E)$ of the vector bundle.
The definition induces a derivation on the entire tensor algebra satisfying the
additional conditions,
5) $\nabla_X(T_1 \otimes T_2) = \nabla_X T_1 \otimes T_2 + T_1 \otimes \nabla_X T_2$,
6) $\nabla_X \circ C = C \circ \nabla_X$, for any contraction $C$.
The properties are the same as those of a Koszul connection, or covariant derivative for
tensor-valued 0-forms $T$. Given an orthonormal frame, consider the identity
tensor,
$$I = \delta^\alpha{}_\beta\; e_\alpha \otimes \theta^\beta, \tag{6.49}$$


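and take the covariant derivative $\nabla_X$. We get

$$
\begin{aligned}
\nabla_X e_\alpha \otimes \theta^\alpha + e_\alpha \otimes \nabla_X \theta^\alpha &= 0, \\
e_\alpha \otimes \nabla_X \theta^\alpha &= -\nabla_X e_\alpha \otimes \theta^\alpha, \\
&= -e_\beta\, \omega^\beta{}_\alpha(X) \otimes \theta^\alpha, \\
e_\beta \otimes \nabla_X \theta^\beta &= -e_\beta\, \omega^\beta{}_\alpha(X) \otimes \theta^\alpha,
\end{aligned}
$$

which implies that,

$$\nabla_X \theta^\beta = -\omega^\beta{}_\alpha(X)\, \theta^\alpha. \tag{6.50}$$

Thus as before, since we have formulas for the covariant derivative of basis
vectors and forms, we are led by induction to a general formula for the covariant
derivative of an $\binom{r}{s}$-tensor given mutatis mutandis by the formula 3.32. In other
words, the covariant derivative of a tensor acquires a term with a multiplicative
connection factor for each contravariant index and a negative term with a
multiplicative connection factor for each covariant index.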


6.4.1 Definition A connection $\nabla$ on the vector bundle $E$ is a map
$$\nabla : \Gamma(M, E) \to \Gamma(M, E \otimes T^*(M))$$

which satisfies the following conditions
a) $\nabla(T_1 + T_2) = \nabla T_1 + \nabla T_2$, for $T_1, T_2 \in \Gamma(E)$,
b) $\nabla(fT) = df \otimes T + f\, \nabla T$,
c) $\nabla_X T = i_X \nabla T$.

As a reminder of the definition of the inner product $i_X$, condition (c) is
equivalent to the equation,

$$\nabla T(\theta^1, \dots, \theta^r, X, X_1, \dots, X_s) = (\nabla_X T)(\theta^1, \dots, \theta^r, X_1, \dots, X_s).$$

In particular, if $X$ is a vector field, then, as expected,
$$\nabla X(Y) = \nabla_Y X.$$

The operator $\nabla$ is called the covariant differential. Again, for general vector
bundles, we denote the sections by $s \in \Gamma(E)$ and the covariant differential by
$\nabla s$.


6.4.2 Affine Connections 


A connection on the tangent bundle $T(M)$ is called an affine connection. In
a local frame field $e$, we may assume that the connection is represented by a
matrix of one-forms $\omega$

$$\nabla e_\beta = e_\alpha \otimes \omega^\alpha{}_\beta, \qquad \nabla e = e \otimes \omega. \tag{6.51}$$

The tensor multiplication symbol is often omitted when it is clear in context.
Thus, for example, the connection equation is sometimes written as $\nabla e = e\,\omega$.
In a local coordinate system $\{x^1, \dots, x^n\}$, with basis vectors $\partial_\mu = \frac{\partial}{\partial x^\mu}$ and dual
forms $dx^\mu$, we have,
$$\omega^\alpha{}_\beta = \Gamma^\alpha{}_{\gamma\beta}\, dx^\gamma.$$

From equation 6.50, it follows that
$$\nabla \theta^\alpha = -\omega^\alpha{}_\beta \otimes \theta^\beta. \tag{6.52}$$


We need to be a bit careful with the dual forms $\theta^\alpha$. We can view them as a
vector-valued 1-form
$$\theta = e_\alpha \otimes \theta^\alpha,$$

which has the same components as the identity $\binom{1}{1}$ tensor. This is a kind of
odd creature. About the closest analog to this entity in classical terms is the
differential of arc-length

$$d\mathbf{x} = \mathbf{i}\, dx + \mathbf{j}\, dy + \mathbf{k}\, dz,$$

which is sort of a mixture of a vector and a form. The vector of differential
forms would then be written as a column vector.
In a frame $\{e_\alpha\}$, the covariant differential of a tensor-valued 0-form $T$ is given by

$$\nabla T = \nabla_{e_\beta} T \otimes \theta^\beta = \nabla_\beta T \otimes \theta^\beta.$$


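In particular, if $X = v^\alpha e_\alpha$, we get,

$$
\begin{aligned}
\nabla X &= \nabla_\beta X \otimes \theta^\beta = \nabla_\beta(v^\alpha e_\alpha) \otimes \theta^\beta, \\
&= (\nabla_\beta(v^\alpha)\, e_\alpha + v^\alpha\, \Gamma^\gamma{}_{\beta\alpha}\, e_\gamma) \otimes \theta^\beta, \\
&= (v^\alpha{}_{,\beta} + v^\gamma\, \Gamma^\alpha{}_{\beta\gamma})\, e_\alpha \otimes \theta^\beta, \\
&= v^\alpha{}_{\|\beta}\; e_\alpha \otimes \theta^\beta,
\end{aligned}
$$

where we have used the classical symbols

$$v^\alpha{}_{\|\beta} = v^\alpha{}_{,\beta} + \Gamma^\alpha{}_{\beta\gamma}\, v^\gamma \tag{6.53}$$

for the covariant derivative components $v^\alpha{}_{\|\beta}$ and the comma to abbreviate the
directional derivative $\nabla_\beta(v^\alpha)$. Of course, the formula is in agreement with
equation 3.25. $\nabla X$ is a $\binom{1}{1}$-tensor.
Similarly, for a covariant vector field $\alpha = v_\alpha \theta^\alpha$, we have

$$
\begin{aligned}
\nabla \alpha &= \nabla(v_\alpha \otimes \theta^\alpha), \\
&= \nabla v_\alpha \otimes \theta^\alpha - v_\beta\, \omega^\beta{}_\alpha \otimes \theta^\alpha, \\
&= (\nabla_\gamma v_\alpha\, \theta^\gamma - v_\beta\, \Gamma^\beta{}_{\gamma\alpha}\, \theta^\gamma) \otimes \theta^\alpha, \\
&= (v_{\alpha,\gamma} - \Gamma^\beta{}_{\gamma\alpha}\, v_\beta)\; \theta^\gamma \otimes \theta^\alpha,
\end{aligned}
$$

hence,

$$v_{\alpha\|\gamma} = v_{\alpha,\gamma} - \Gamma^\beta{}_{\gamma\alpha}\, v_\beta. \tag{6.54}$$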


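As promised earlier, we now prove the Ricci identities for contravariant and
covariant vectors when the torsion is not zero.

Ricci Identities with torsion. The results are,

$$
\begin{aligned}
X^\mu{}_{\|\alpha\beta} - X^\mu{}_{\|\beta\alpha} &= X^\nu R^\mu{}_{\nu\alpha\beta} - X^\mu{}_{\|\nu}\, T^\nu{}_{\alpha\beta}, \\
X_{\mu\|\alpha\beta} - X_{\mu\|\beta\alpha} &= -X_\nu R^\nu{}_{\mu\alpha\beta} - X_{\mu\|\nu}\, T^\nu{}_{\alpha\beta}.
\end{aligned} \tag{6.55}
$$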


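We prove the first one. Let $X = X^\mu e_\mu$. We have

$$
\begin{aligned}
\nabla X &= \nabla_\beta X \otimes \theta^\beta, \\
\nabla^2 X &= \nabla(\nabla_\beta X \otimes \theta^\beta), \\
&= \nabla(\nabla_\beta X) \otimes \theta^\beta + \nabla_\beta X \otimes \nabla\theta^\beta, \\
&= \nabla_\alpha \nabla_\beta X \otimes \theta^\beta \otimes \theta^\alpha - \nabla_\beta X \otimes \omega^\beta{}_\alpha \otimes \theta^\alpha, \\
&= \nabla_\alpha \nabla_\beta X \otimes \theta^\beta \otimes \theta^\alpha - \nabla_\mu X \otimes \Gamma^\mu{}_{\alpha\beta}\, \theta^\beta \otimes \theta^\alpha, \\
\nabla^2 X &= (\nabla_\alpha \nabla_\beta X - \nabla_\mu X\, \Gamma^\mu{}_{\alpha\beta})\; \theta^\beta \otimes \theta^\alpha.
\end{aligned}
$$

On the other hand, we also have $\nabla X = \nabla_\alpha X \otimes \theta^\alpha$, so we can compute $\nabla^2$ by
differentiating in the reverse order to get the equivalent expression,

$$\nabla^2 X = (\nabla_\beta \nabla_\alpha X - \nabla_\mu X\, \Gamma^\mu{}_{\beta\alpha})\; \theta^\alpha \otimes \theta^\beta.$$

Subtracting the last two equations we get an alternating tensor, or a two-form
that we can set equal to zero. For lack of a better notation we call this form
$[\nabla, \nabla]$. The notations $\mathrm{Alt}(\nabla^2)$ and $\nabla \wedge \nabla$ also appear in the literature. We get

$$
\begin{aligned}
[\nabla, \nabla] &= [(\nabla_\alpha \nabla_\beta - \nabla_\beta \nabla_\alpha) X - \nabla_\mu X\, (\Gamma^\mu{}_{\alpha\beta} - \Gamma^\mu{}_{\beta\alpha})]\; \theta^\beta \wedge \theta^\alpha, \\
&= [(\nabla_\alpha \nabla_\beta - \nabla_\beta \nabla_\alpha - \nabla_{[e_\alpha, e_\beta]}) X + \nabla_{[e_\alpha, e_\beta]} X - \nabla_\mu X\, (\Gamma^\mu{}_{\alpha\beta} - \Gamma^\mu{}_{\beta\alpha})]\; \theta^\beta \wedge \theta^\alpha, \\
&= [R(e_\alpha, e_\beta) X + C^\mu{}_{\alpha\beta}\, \nabla_\mu X - \nabla_\mu X\, (\Gamma^\mu{}_{\alpha\beta} - \Gamma^\mu{}_{\beta\alpha})]\; \theta^\beta \wedge \theta^\alpha, \\
&= [R(e_\alpha, e_\beta) X - \nabla_\mu X\, (\Gamma^\mu{}_{\alpha\beta} - \Gamma^\mu{}_{\beta\alpha} - C^\mu{}_{\alpha\beta})]\; \theta^\beta \wedge \theta^\alpha, \\
&= \tfrac{1}{2}\, (X^\nu R^\mu{}_{\nu\alpha\beta} - \nabla_\mu X\, T^\mu{}_{\alpha\beta})\; \theta^\beta \wedge \theta^\alpha.
\end{aligned}
$$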


6.4.3 Exterior Covariant Derivative 


Since we know how to take the covariant differential of the basis vectors,
covectors, and tensor products thereof, an affine connection on the tangent
bundle induces a covariant differential on the tensor bundle. It is easy to get a
formula by induction for the covariant differential of a tensor-valued 0-form. A
given connection can be extended in a unique way to tensor-valued $p$-forms.
Just as with the wedge product of a 0-form $f$ with a $p$-form $\alpha$, for which we identify
$f\alpha$ with $f \otimes \alpha = f \wedge \alpha$, we write a tensor-valued $p$-form as $T \otimes \alpha = T \wedge \alpha$, where
$T$ is a type $\binom{r}{s}$ tensor. We define the exterior covariant derivative

$$D : \Omega^p(M, E) \to \Omega^{p+1}(M, E)$$

by requiring that,

$$D(T \otimes \alpha) = D(T \wedge \alpha) = \nabla T \wedge \alpha + (-1)^p\, T \wedge d\alpha. \tag{6.56}$$


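It is instructive to show the details of the computation of the exterior covariant
derivative of the vector-valued one-forms $\theta$ and $\Theta$, and the $\binom{1}{1}$ tensor-valued
2-form $\Omega$. The results are

$$
\begin{aligned}
D\theta^\alpha &= d\theta^\alpha + \omega^\alpha{}_\beta \wedge \theta^\beta, \\
D\Theta^\alpha &= d\Theta^\alpha + \omega^\alpha{}_\beta \wedge \Theta^\beta, \\
D\Omega^\alpha{}_\beta &= d\Omega^\alpha{}_\beta + \omega^\alpha{}_\gamma \wedge \Omega^\gamma{}_\beta - \Omega^\alpha{}_\gamma \wedge \omega^\gamma{}_\beta.
\end{aligned} \tag{6.57}
$$

The first two follow immediately; we compute the third. We start by writing

$$\Omega = \Omega^\alpha{}_\beta\; e_\alpha \otimes \theta^\beta = (e_\alpha \otimes \theta^\beta) \wedge \Omega^\alpha{}_\beta.$$

Then,

$$
\begin{aligned}
D\Omega &= D(e_\alpha \otimes \theta^\beta) \wedge \Omega^\alpha{}_\beta + (-1)^2\, (e_\alpha \otimes \theta^\beta) \wedge d\Omega^\alpha{}_\beta, \\
&= (De_\alpha \otimes \theta^\beta + e_\alpha \otimes D\theta^\beta) \wedge \Omega^\alpha{}_\beta + (e_\alpha \otimes \theta^\beta) \wedge d\Omega^\alpha{}_\beta, \\
&= (e_\gamma \otimes \omega^\gamma{}_\alpha \otimes \theta^\beta - e_\alpha \otimes \omega^\beta{}_\gamma \otimes \theta^\gamma) \wedge \Omega^\alpha{}_\beta + (e_\alpha \otimes \theta^\beta) \wedge d\Omega^\alpha{}_\beta, \\
&= (e_\alpha \otimes \theta^\beta) \wedge (d\Omega^\alpha{}_\beta + \omega^\alpha{}_\gamma \wedge \Omega^\gamma{}_\beta - \omega^\gamma{}_\beta \wedge \Omega^\alpha{}_\gamma).
\end{aligned}
$$

In the last step we had to relabel a couple of indices so that we could factor out
$(e_\alpha \otimes \theta^\beta)$. The pattern should be clear. We get an exterior derivative for the
forms, an $\omega \wedge \Omega$ term for the contravariant index and an $\Omega \wedge \omega$ term with the
appropriate sign for the covariant index. Here the computation gives

$$D\Omega^\alpha{}_\beta = d\Omega^\alpha{}_\beta + \omega^\alpha{}_\gamma \wedge \Omega^\gamma{}_\beta - \omega^\gamma{}_\beta \wedge \Omega^\alpha{}_\gamma,$$

or

$$D\Omega = d\Omega + \omega \wedge \Omega - \Omega \wedge \omega. \tag{6.58}$$

This means that we can write the equations of structure as

$$\Theta = D\theta, \qquad \Omega = d\omega + \omega \wedge \omega, \tag{6.59}$$

and the Bianchi identities as

$$D\Theta = \Omega \wedge \theta, \qquad D\Omega = 0. \tag{6.60}$$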


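With apologies for the redundancy, we reproduce the change of basis formula
3.49. Let $e' = eB$ be an orthogonal change of basis. Then

$$
\begin{aligned}
De' &= e \otimes dB + De\, B, \\
&= e \otimes dB + (e \otimes \omega) B, \\
&= e' \otimes (B^{-1} dB + B^{-1} \omega B), \\
&= e' \otimes \omega',
\end{aligned}
$$

where,

$$\omega' = B^{-1} dB + B^{-1} \omega B. \tag{6.61}$$

Multiply the last equation by $B$ and take the exterior derivative $d$. We get,

$$
\begin{aligned}
B\omega' &= dB + \omega B, \\
B\, d\omega' + dB \wedge \omega' &= d\omega\, B - \omega \wedge dB, \\
B\, d\omega' + (B\omega' - \omega B) \wedge \omega' &= d\omega\, B - \omega \wedge (B\omega' - \omega B), \\
B(d\omega' + \omega' \wedge \omega') &= (d\omega + \omega \wedge \omega) B.
\end{aligned}
$$

Setting $\Omega = d\omega + \omega \wedge \omega$, and $\Omega' = d\omega' + \omega' \wedge \omega'$, the last equation reads,

$$\Omega' = B^{-1} \Omega B. \tag{6.62}$$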


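As pointed out after equation 3.49, the curvature is a tensorial form of adjoint
type. The transformation law above for the connection has an extra term,
so it is not tensorial. It is easy to obtain the classical transformation law
for the Christoffel symbols from equation 6.61. Let $\{x^\alpha\}$ be coordinates in a
patch $(\phi_\alpha, U_\alpha)$, and $\{y^\beta\}$ be coordinates on an overlapping patch $(\phi_\beta, U_\beta)$. The
transition functions $\phi_{\alpha\beta}$ are given by the Jacobian of the change of coordinates,

$$\frac{\partial}{\partial y^\beta} = \frac{\partial x^\alpha}{\partial y^\beta}\, \frac{\partial}{\partial x^\alpha}, \qquad \phi_{\alpha\beta} = \frac{\partial x^\alpha}{\partial y^\beta}.$$

Inserting the connection components $\omega'^\alpha{}_\beta = \Gamma'^\alpha{}_{\gamma\beta}\, dy^\gamma$ into the change of basis
formula 6.61, with $B = \phi_{\alpha\beta}$, we get¹,

$$
\begin{aligned}
\omega'^\alpha{}_\beta &= (B^{-1})^\alpha{}_\kappa\, dB^\kappa{}_\beta + (B^{-1})^\alpha{}_\kappa\, \omega^\kappa{}_\lambda\, B^\lambda{}_\beta, \\
&= \frac{\partial y^\alpha}{\partial x^\kappa}\, d\!\left(\frac{\partial x^\kappa}{\partial y^\beta}\right) + \frac{\partial y^\alpha}{\partial x^\kappa}\, \omega^\kappa{}_\lambda\, \frac{\partial x^\lambda}{\partial y^\beta}, \\
\Gamma'^\alpha{}_{\gamma\beta}\, dy^\gamma &= \frac{\partial y^\alpha}{\partial x^\kappa}\, \frac{\partial^2 x^\kappa}{\partial y^\gamma \partial y^\beta}\, dy^\gamma + \frac{\partial y^\alpha}{\partial x^\kappa}\, \Gamma^\kappa{}_{\sigma\lambda}\, dx^\sigma\, \frac{\partial x^\lambda}{\partial y^\beta}, \\
\Gamma'^\alpha{}_{\gamma\beta}\, dy^\gamma &= \left[\frac{\partial y^\alpha}{\partial x^\kappa}\, \frac{\partial^2 x^\kappa}{\partial y^\gamma \partial y^\beta} + \frac{\partial y^\alpha}{\partial x^\kappa}\, \Gamma^\kappa{}_{\sigma\lambda}\, \frac{\partial x^\sigma}{\partial y^\gamma}\, \frac{\partial x^\lambda}{\partial y^\beta}\right] dy^\gamma.
\end{aligned}
$$

Thus, we retrieve the classical transformation law for Christoffel symbols that
one finds in texts on general relativity.

$$\Gamma'^\alpha{}_{\gamma\beta} = \Gamma^\kappa{}_{\sigma\lambda}\, \frac{\partial y^\alpha}{\partial x^\kappa}\, \frac{\partial x^\sigma}{\partial y^\gamma}\, \frac{\partial x^\lambda}{\partial y^\beta} + \frac{\partial y^\alpha}{\partial x^\kappa}\, \frac{\partial^2 x^\kappa}{\partial y^\gamma \partial y^\beta}. \tag{6.63}$$

¹We use this notation reluctantly, to be consistent with most literature. The notation
results in violation of the index notation. We really should be writing $\phi^\alpha{}_\beta$, since in this case,
the transition functions are matrix-valued.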




6.4.4 Parallelism 


When first introduced to vectors in elementary calculus and physics courses,
vectors are often described as entities characterized by a direction and a length.
This primitive notion, that two such entities in $\mathbf{R}^n$ with the same direction and
length represent the same vector, regardless of location, is not erroneous in the
sense that parallel translation of a vector in $\mathbf{R}^n$ does not change the attributes
of a vector as described. In elementary linear algebra, vectors are described
as $n$-tuples in $\mathbf{R}^n$ equipped with the operations of addition and multiplication
by scalar, and subject to eight vector space properties. Again, those vectors
can be represented by arrows which can be located anywhere in $\mathbf{R}^n$ as long
as they have the same components. This is another indication that parallel
transport of a vector in $\mathbf{R}^n$ is trivial, a manifestation of the fact that $\mathbf{R}^n$ is a flat
space. However, in a space that is not flat, such as a sphere, parallel transport
of vectors is intimately connected with the curvature of the space. To elucidate
this connection, we first describe parallel transport for a surface in $\mathbf{R}^3$.


6.4.2 Definition Let $\alpha(t)$, with coordinates $u^\alpha(t)$, be a curve on a surface $\mathbf{x} = \mathbf{x}(u^\alpha)$, and let
$V = \alpha'(t) = \alpha_*(\frac{d}{dt})$ be the velocity vector as defined in 1.25. A vector field $Y$
is called parallel along $\alpha$ if

$$\nabla_V Y = 0,$$

as illustrated in figure 6.2. The notation

$$\frac{DY}{dt} = \nabla_V Y$$

is also common in the literature. The vector field $\nabla_V V$ is called the geodesic
vector field, and its magnitude is called the geodesic curvature $\kappa_g$ of $\alpha$. As usual,
we define the speed $v$ of the curve by $\|V\|$ and the unit tangent $T = V/\|V\|$, so
that $V = vT$. We assume $v > 0$ so that $T$ is defined on the domain of the curve.
The arc length $s$ along the curve is related to the speed by the equation
$v = ds/dt$.


6.4.3 Definition A curve $\alpha(t)$ with velocity vector $V = \alpha'(t)$ is called a
geodesic or self-parallel if $\nabla_V V = 0$.


6.4.4 Theorem A curve $\alpha(t)$ is geodesic iff
a) $v = \|V\|$ is constant along the curve and,
b) either $\nabla_T T = 0$, or $\kappa_g = 0$.
Proof Expanding the definition of the geodesic vector field:

$$
\begin{aligned}
\nabla_V V &= \nabla_{vT}(vT), \\
&= v\, \nabla_T(vT), \\
&= v\, T(v)\, T + v^2\, \nabla_T T.
\end{aligned}
$$

We have $\langle T, T\rangle = 1$, so $\langle \nabla_T T, T\rangle = 0$, which shows that $\nabla_T T$ is orthogonal
to $T$. We also have $v > 0$. Since both the tangential and the normal components
need to vanish, the theorem follows.


If $M$ is a hypersurface in $\mathbf{R}^n$ with unit normal $\mathbf{n}$, we gain more insight
on the geometry of geodesics as a direct consequence of the discussion above.
Without real loss of generality consider the geometry in the case of $n = 3$.
Since $\alpha$ is geodesic, we have $\|\alpha'\|^2 = \langle \alpha', \alpha'\rangle = $ constant. Differentiation gives
$\langle \alpha', \alpha''\rangle = 0$, so that the acceleration $\alpha''$ is orthogonal to $\alpha'$. Comparing
with equation 4.34 we see that $T' = \kappa_n \mathbf{n}$, which reinforces the fact that the
entire curvature of the curve is due to the normal curvature of the surface as
a submanifold of the ambient space. In this sense, inhabitants constrained to
live on the surface would be unaware of this curvature, and to them, geodesics
would appear locally as the straightest path to travel. Thus, for a sphere in
$\mathbf{R}^3$ of radius $a$, the acceleration $\alpha''$ of a geodesic only has a normal component,
and the normal curvature is $1/a$. That is, the geodesic must lie along a great
circle.


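6.4.5 Theorem Let $\alpha(t)$ be a curve with velocity $V$. For each vector $Y$ in the
tangent space restricted to the curve, there is a unique vector field $Y(t)$ locally
obtained by parallel transport.


Proof We choose local coordinates with frame field $\{e_\alpha = \frac{\partial}{\partial u^\alpha}\}$. We write the
components of the vector fields in terms of the frame

$$
\begin{aligned}
Y &= y^\beta\, \frac{\partial}{\partial u^\beta}, \qquad V = \frac{du^\alpha}{dt}\, \frac{\partial}{\partial u^\alpha}, \\
\nabla_V Y &= \nabla_{\dot u^\alpha e_\alpha}(y^\beta e_\beta), \\
&= \dot u^\alpha\, \nabla_{e_\alpha}(y^\beta e_\beta), \\
&= \frac{du^\alpha}{dt}\, \frac{\partial y^\beta}{\partial u^\alpha}\, e_\beta + \dot u^\alpha\, y^\beta\, \Gamma^\gamma{}_{\alpha\beta}\, e_\gamma, \\
&= \left[\frac{dy^\gamma}{dt} + y^\beta\, \frac{du^\alpha}{dt}\, \Gamma^\gamma{}_{\alpha\beta}\right] e_\gamma = 0.
\end{aligned}
$$

The existence and uniqueness of the coefficients $y^\beta$ that define $Y$ are
guaranteed by the theorem on existence and uniqueness of differential equations with
appropriate initial conditions.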
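
To make the transport equation $\frac{dy^\gamma}{dt} + \Gamma^\gamma{}_{\alpha\beta}\,\dot u^\alpha y^\beta = 0$ concrete, here is a minimal numerical sketch for the unit sphere with coordinates $(u^1, u^2) = (\theta, \phi)$ and metric $d\theta^2 + \sin^2\theta\, d\phi^2$. The nonzero Christoffel symbols used, $\Gamma^\theta{}_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi{}_{\theta\phi} = \Gamma^\phi{}_{\phi\theta} = \cot\theta$, are the standard ones for this metric; the integrator (classical Runge-Kutta), the step count, and all function names are assumptions made for illustration. The vector is transported once around the latitude circle $\theta = \theta_0$.

```python
import math


def christoffel_sphere(theta):
    """Christoffel symbols of the unit sphere, gam[c][a][b] = Gamma^c_{ab},
    with index 0 for theta and index 1 for phi."""
    gam = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    gam[0][1][1] = -math.sin(theta) * math.cos(theta)
    gam[1][0][1] = gam[1][1][0] = math.cos(theta) / math.sin(theta)
    return gam


def transport_rhs(theta0, y):
    """dy^c/dt = -Gamma^c_{ab} udot^a y^b along the latitude theta = theta0,
    parametrized by phi = t, so that udot = (0, 1)."""
    gam = christoffel_sphere(theta0)
    udot = (0.0, 1.0)
    return [
        -sum(gam[c][a][b] * udot[a] * y[b] for a in range(2) for b in range(2))
        for c in range(2)
    ]


def transport_around_latitude(theta0, y0, steps=2000):
    """Parallel transport y0 once around the latitude circle using RK4."""
    h = 2.0 * math.pi / steps
    y = list(y0)
    f = lambda v: transport_rhs(theta0, v)
    for _ in range(steps):
        k1 = f(y)
        k2 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k1)])
        k3 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k2)])
        k4 = f([yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6.0 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
    return y


def norm_squared(theta0, y):
    """Length squared of y in the sphere metric at latitude theta0."""
    return y[0] ** 2 + (math.sin(theta0) * y[1]) ** 2


if __name__ == "__main__":
    theta0 = math.pi / 3
    print(transport_around_latitude(theta0, (1.0, 0.0)))
```

In orthonormal components the transported vector rotates by the angle $2\pi\cos\theta_0$ after one loop, so at $\theta_0 = \pi/3$ it should come back reversed while keeping its length. The failure to return to the initial vector is the holonomy discussed earlier in connection with curvature.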


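We derive the equations of geodesics by an almost identical computation.

$$
\begin{aligned}
\nabla_V V &= \nabla_{\dot u^\alpha e_\alpha}[\dot u^\beta e_\beta], \\
&= \ddot u^\sigma e_\sigma + \dot u^\alpha \dot u^\beta\, \Gamma^\sigma{}_{\alpha\beta}\, e_\sigma, \\
&= [\ddot u^\sigma + \dot u^\alpha \dot u^\beta\, \Gamma^\sigma{}_{\alpha\beta}]\, e_\sigma.
\end{aligned}
$$

Thus, the equation for geodesics becomes

$$\ddot u^\sigma + \Gamma^\sigma{}_{\alpha\beta}\, \dot u^\alpha \dot u^\beta = 0. \tag{6.65}$$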


The existence and uniqueness theorem for solutions of differential equations 
leads to the following theorem 


6.4.6 Theorem Let $p$ be a point in $M$ and $V$ a vector in $T_pM$. Then, for any
real number $t_0$, there exists a number $\delta$ and a curve $\alpha(t)$ defined on $[t_0 - \delta, t_0 + \delta]$,
such that $\alpha(t_0) = p$, $\alpha'(t_0) = V$, and $\alpha$ is a geodesic.
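
The existence statement can be illustrated by integrating the geodesic equation 6.65 numerically from given initial data. The sketch below again assumes the unit sphere with coordinates $(\theta, \phi)$, for which the equations reduce to $\ddot\theta = \sin\theta\cos\theta\,\dot\phi^2$ and $\ddot\phi = -2\cot\theta\,\dot\theta\dot\phi$. The Runge-Kutta integrator, the initial data, and the function names are illustrative assumptions. The check is that the computed curve has constant speed and stays in the plane through the origin spanned by the initial position and velocity, that is, on a great circle.

```python
import math


def geodesic_rhs(state):
    """First-order form of the geodesic equation on the unit sphere;
    state = (theta, phi, dtheta, dphi)."""
    theta, _, dtheta, dphi = state
    ddtheta = math.sin(theta) * math.cos(theta) * dphi ** 2
    ddphi = -2.0 * (math.cos(theta) / math.sin(theta)) * dtheta * dphi
    return [dtheta, dphi, ddtheta, ddphi]


def rk4_step(f, y, h):
    """One classical Runge-Kutta step for y' = f(y)."""
    k1 = f(y)
    k2 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k1)])
    k3 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k2)])
    k4 = f([yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6.0 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def integrate_geodesic(state, t_end, steps=4000):
    """Return the list of states along the geodesic on [0, t_end]."""
    h = t_end / steps
    path = [list(state)]
    for _ in range(steps):
        state = rk4_step(geodesic_rhs, state, h)
        path.append(state)
    return path


def embed(theta, phi):
    """Position in R^3 of the point (theta, phi) on the unit sphere."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


def speed_squared(state):
    """Squared speed in the metric d theta^2 + sin^2(theta) d phi^2."""
    theta, _, dtheta, dphi = state
    return dtheta ** 2 + (math.sin(theta) * dphi) ** 2


# Start on the equator at (1, 0, 0) with unit velocity (0, 0.8, -0.6) in R^3.
start = [math.pi / 2, 0.0, 0.6, 0.8]
path = integrate_geodesic(start, 2.0 * math.pi)
normal = (0.0, 0.6, 0.8)  # cross product of initial position and velocity
max_off_plane = max(
    abs(sum(p * n for p, n in zip(embed(s[0], s[1]), normal))) for s in path
)

if __name__ == "__main__":
    print(max_off_plane, speed_squared(path[-1]))
```

The initial data were chosen so that the great circle stays away from the poles, where the coordinates $(\theta, \phi)$ are singular; a more robust treatment would switch charts near $\theta = 0$ or $\theta = \pi$.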


For a general vector bundle $E$ over a manifold $M$, a section $s \in \Gamma(E)$ of the
vector bundle is called a parallel section if

$$\nabla s = 0. \tag{6.66}$$

We discuss the length minimizing properties of geodesics in section 6.6 and provide
a number of examples for surfaces in $\mathbf{R}^3$ and for Lorentzian manifolds. Since
geodesic curves have zero acceleration, in Euclidean space they are straight
lines. In Einstein's theory of relativity, gravitation is a fictitious force caused
by the curvature of space time, so geodesics represent the trajectory of free
particles.


6.5 Lorentzian Manifolds 


The formalism above refers to Riemannian manifolds, for which the metric
is positive definite, but it applies just as well to pseudo-Riemannian manifolds.
A 4-dimensional manifold $\{M, g\}$ is called a Lorentzian manifold if the metric
has signature $(+ - - -)$. Locally, a Lorentzian manifold is diffeomorphic to
Minkowski's space, which is the model space introduced in section 2.2.3. Some
authors use signature $(- + + +)$.

For the purposes of general relativity, we introduce the symmetric
Ricci tensor $R_{\beta\delta}$ by the contraction

$$R_{\beta\delta} = R^\alpha{}_{\beta\alpha\delta}, \tag{6.67}$$

and the scalar curvature $R$ by

$$R = R^\alpha{}_\alpha. \tag{6.68}$$

The trace-reversed Ricci tensor

$$G_{\alpha\beta} = R_{\alpha\beta} - \tfrac{1}{2} R\, g_{\alpha\beta}, \tag{6.69}$$

is called the Einstein tensor. The Einstein field equations (without a
cosmological constant) are

$$G_{\alpha\beta} = \frac{8\pi G}{c^4}\, T_{\alpha\beta}, \tag{6.70}$$

where $T$ is the stress energy tensor and $G$ is the gravitational constant. As I
first learned from one of my professors, Arthur Fischer, the equation states that
curvature indicates the presence of matter, and matter tells the space how to
curve. Einstein equations with cosmological constant $\Lambda$ are,

$$R_{\alpha\beta} - \tfrac{1}{2} R\, g_{\alpha\beta} + \Lambda g_{\alpha\beta} = \frac{8\pi G}{c^4}\, T_{\alpha\beta}. \tag{6.71}$$

Fig. 6.3: Gravity

A space time which satisfies

$$R_{\alpha\beta} = 0$$

is called Ricci-flat. A space in which the Ricci tensor is proportional to the metric,

$$R_{\alpha\beta} = k\, g_{\alpha\beta},$$

is called an Einstein manifold.


6.5.1 Example: Vaidya Metric 

This example of a curvature computation in four-dimensional space-time
is due to W. Israel. It appears in his 1978 notes on Differential Forms in
General Relativity, but the author indicates the work arose 10 years earlier
from a seminar at the Dublin Institute for Advanced Studies. The most general,
spherically symmetric, static solution of the Einstein vacuum equations is the
Schwarzschild metric²

$$ds^2 = \left[1 - \frac{2GM}{r}\right] dt^2 - \frac{1}{\left[1 - \frac{2GM}{r}\right]}\, dr^2 - r^2 d\theta^2 - r^2 \sin^2\theta\, d\phi^2. \tag{6.74}$$

It is convenient to set $m = GM$ and introduce the retarded coordinate
transformation

$$t = u + r + 2m \ln\left(\frac{r}{2m} - 1\right),$$

so that,

$$dt = du + \frac{1}{\left(1 - \frac{2m}{r}\right)}\, dr.$$

Substitution for $dt$ above gives the metric in outgoing Eddington-Finkelstein
coordinates,

$$ds^2 = 2\, dr\, du + \left[1 - \frac{2m}{r}\right] du^2 - r^2 d\theta^2 - r^2 \sin^2\theta\, d\phi^2. \tag{6.75}$$
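
The differential of the retarded coordinate transformation can be verified numerically. The following sketch compares a central finite difference of $t(u, r) = u + r + 2m\ln(\frac{r}{2m} - 1)$ with the closed form $\partial t/\partial r = 1/(1 - 2m/r)$, and then confirms that the $dr^2$ coefficient cancels after substitution into the Schwarzschild metric, which is why no $dr^2$ term appears in 6.75. The step size, sample values, and function names are assumptions for illustration, and the region $r > 2m$ is assumed so that the logarithm is defined.

```python
import math


def schwarzschild_time(u, r, m):
    """t as a function of the retarded time u and radius r (r > 2m)."""
    return u + r + 2.0 * m * math.log(r / (2.0 * m) - 1.0)


def dt_dr(u, r, m, h=1e-5):
    """Central finite-difference approximation of the partial derivative dt/dr."""
    return (schwarzschild_time(u, r + h, m)
            - schwarzschild_time(u, r - h, m)) / (2.0 * h)


def drdr_coefficient(r, m):
    """Coefficient of dr^2 after substituting dt = du + dr / (1 - 2m/r)
    into [1 - 2m/r] dt^2 - dr^2 / [1 - 2m/r]."""
    f = 1.0 - 2.0 * m / r
    slope = 1.0 / f
    return f * slope ** 2 - 1.0 / f


if __name__ == "__main__":
    u, r, m = 0.3, 5.0, 1.0
    print(dt_dr(u, r, m), 1.0 / (1.0 - 2.0 * m / r), drdr_coefficient(r, m))
```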


In these coordinates it is evident that the event horizon $r = 2m$ is not a real
singularity. The Vaidya metric is the generalization

$$ds^2 = 2\, dr\, du + \left[1 - \frac{2m(u)}{r}\right] du^2 - r^2 d\theta^2 - r^2 \sin^2\theta\, d\phi^2, \tag{6.76}$$

where $m(u)$ is now an arbitrary function. The geometry described by the Vaidya
solution to Einstein equations represents the gravitational field in the exterior
of a radiating, spherically symmetric star. In all our previous curvature
computations by differential forms, the metric has been diagonal; this is an instructive
example of one with a non-diagonal metric. The first step in the curvature
computation involves picking out a basis of one-forms. The idea is to pick out the
forms so that in the new basis, the metric has constant coefficients. One possible
choice of 1-forms is

$$
\begin{aligned}
\theta^0 &= du, \\
\theta^1 &= dr + \tfrac{1}{2}\left[1 - \frac{2m(u)}{r}\right] du, \\
\theta^2 &= r\, d\theta, \\
\theta^3 &= r \sin\theta\, d\phi.
\end{aligned} \tag{6.77}
$$

In terms of these forms, the line element becomes

$$ds^2 = g_{\alpha\beta}\, \theta^\alpha \theta^\beta = 2\theta^0\theta^1 - (\theta^2)^2 - (\theta^3)^2,$$

where

$$g_{01} = g_{10} = -g_{22} = -g_{33} = 1,$$

while all the other $g_{\alpha\beta} = 0$. In the coframe, the metric has components:

$$g_{\alpha\beta} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{6.78}$$

²The Schwarzschild radius is $r = \frac{2GM}{c^2}$, but here we follow the common convention of
setting $c = 1$.
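
As a check on the choice of coframe, the sketch below represents each 1-form by its coefficients in the coordinate basis $(du, dr, d\theta, d\phi)$ and rebuilds the coordinate metric from $g_{\alpha\beta}\theta^\alpha\theta^\beta$ with the constant matrix of 6.78. The result is compared with the coefficients read off from the Vaidya line element 6.76. The value of $m$ passed in stands for $m(u)$ at a fixed $u$; all names and sample values are illustrative assumptions.

```python
import math

# Constant coframe metric: g_01 = g_10 = 1, g_22 = g_33 = -1.
ETA = [[0, 1, 0, 0],
       [1, 0, 0, 0],
       [0, 0, -1, 0],
       [0, 0, 0, -1]]


def vaidya_coframe(m, r, theta):
    """Rows are theta^0..theta^3 expressed in the basis (du, dr, dtheta, dphi)."""
    f = 1.0 - 2.0 * m / r
    return [
        [1.0, 0.0, 0.0, 0.0],                  # theta^0 = du
        [0.5 * f, 1.0, 0.0, 0.0],              # theta^1 = dr + (1/2)[1 - 2m/r] du
        [0.0, 0.0, r, 0.0],                    # theta^2 = r dtheta
        [0.0, 0.0, 0.0, r * math.sin(theta)],  # theta^3 = r sin(theta) dphi
    ]


def metric_from_coframe(E):
    """Coordinate components g_ij = ETA_ab E[a][i] E[b][j]."""
    return [[sum(ETA[a][b] * E[a][i] * E[b][j]
                 for a in range(4) for b in range(4))
             for j in range(4)] for i in range(4)]


def vaidya_metric(m, r, theta):
    """Coordinate components read off from ds^2 = 2 dr du + [1 - 2m/r] du^2
    - r^2 dtheta^2 - r^2 sin^2(theta) dphi^2."""
    f = 1.0 - 2.0 * m / r
    return [[f, 1.0, 0.0, 0.0],
            [1.0, 0.0, 0.0, 0.0],
            [0.0, 0.0, -r * r, 0.0],
            [0.0, 0.0, 0.0, -(r * math.sin(theta)) ** 2]]


if __name__ == "__main__":
    m, r, theta = 1.0, 5.0, 0.7
    print(metric_from_coframe(vaidya_coframe(m, r, theta)))
    print(vaidya_metric(m, r, theta))
```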


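Since the coefficients of the metric are constant, the components $\omega_{\alpha\beta}$ of the
connection will be antisymmetric. This means that

$$\omega_{00} = \omega_{11} = \omega_{22} = \omega_{33} = 0.$$

We thus conclude that

$$
\begin{aligned}
\omega^1{}_0 &= g^{10}\, \omega_{00} = 0, \\
\omega^0{}_1 &= g^{01}\, \omega_{11} = 0, \\
\omega^2{}_2 &= g^{22}\, \omega_{22} = 0, \\
\omega^3{}_3 &= g^{33}\, \omega_{33} = 0.
\end{aligned}
$$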

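To compute the connection, we take the exterior derivative of the basis 1-forms.
The result of this computation is

$$
\begin{aligned}
d\theta^0 &= 0, \\
d\theta^1 &= -d\left[\frac{m}{r}\, du\right] = \frac{m}{r^2}\, dr \wedge du = \frac{m}{r^2}\, \theta^1 \wedge \theta^0, \\
d\theta^2 &= dr \wedge d\theta = \frac{1}{r}\, \theta^1 \wedge \theta^2 - \frac{1}{2r}\left[1 - \frac{2m}{r}\right] \theta^0 \wedge \theta^2, \\
d\theta^3 &= \sin\theta\, dr \wedge d\phi + r \cos\theta\, d\theta \wedge d\phi, \\
&= \frac{1}{r}\, \theta^1 \wedge \theta^3 - \frac{1}{2r}\left[1 - \frac{2m}{r}\right] \theta^0 \wedge \theta^3 + \frac{1}{r} \cot\theta\; \theta^2 \wedge \theta^3.
\end{aligned} \tag{6.79}
$$

For convenience, we write below the first equation of structure [6.24] with zero
torsion in complete detail.

$$
\begin{aligned}
-d\theta^0 &= \omega^0{}_0 \wedge \theta^0 + \omega^0{}_1 \wedge \theta^1 + \omega^0{}_2 \wedge \theta^2 + \omega^0{}_3 \wedge \theta^3, \\
-d\theta^1 &= \omega^1{}_0 \wedge \theta^0 + \omega^1{}_1 \wedge \theta^1 + \omega^1{}_2 \wedge \theta^2 + \omega^1{}_3 \wedge \theta^3, \\
-d\theta^2 &= \omega^2{}_0 \wedge \theta^0 + \omega^2{}_1 \wedge \theta^1 + \omega^2{}_2 \wedge \theta^2 + \omega^2{}_3 \wedge \theta^3, \\
-d\theta^3 &= \omega^3{}_0 \wedge \theta^0 + \omega^3{}_1 \wedge \theta^1 + \omega^3{}_2 \wedge \theta^2 + \omega^3{}_3 \wedge \theta^3.
\end{aligned} \tag{6.80}
$$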


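Since the $\omega$'s are one-forms, they must be linear combinations of the $\theta$'s.
Comparing Cartan's first structural equation with the exterior derivatives of the
coframe, we can start with the initial guess for the connection coefficients
below:

$$
\begin{aligned}
\omega^1{}_0 &= 0, & \omega^1{}_1 &= \frac{m}{r^2}\, \theta^0, & \omega^1{}_2 &= A\, \theta^2, & \omega^1{}_3 &= B\, \theta^3, \\
\omega^2{}_0 &= -\frac{1}{2r}\left[1 - \frac{2m}{r}\right] \theta^2, & \omega^2{}_1 &= \frac{1}{r}\, \theta^2, & \omega^2{}_2 &= 0, & \omega^2{}_3 &= C\, \theta^3, \\
\omega^3{}_0 &= -\frac{1}{2r}\left[1 - \frac{2m}{r}\right] \theta^3, & \omega^3{}_1 &= \frac{1}{r}\, \theta^3, & \omega^3{}_2 &= \frac{1}{r} \cot\theta\; \theta^3, & \omega^3{}_3 &= 0.
\end{aligned}
$$

Here, the quantities $A$, $B$, and $C$ are unknowns to be determined. Observe that
these are not the most general choices for the $\omega$'s. For example, we could have
added a term proportional to $\theta^1$ in the expression for $\omega^1{}_1$, without affecting the
validity of the first structure equation for $d\theta^1$. The strategy is to iteratively
tweak the expressions until we get a set of forms completely consistent with Cartan's
structure equations.

We now take advantage of the skew-symmetry of $\omega_{\alpha\beta}$ to determine the other
components. To find $A$, $B$ and $C$, we note that

$$
\begin{aligned}
\omega^1{}_2 &= g^{10}\, \omega_{02} = -\omega_{20} = \omega^2{}_0, \\
\omega^1{}_3 &= g^{10}\, \omega_{03} = -\omega_{30} = \omega^3{}_0, \\
\omega^2{}_3 &= g^{22}\, \omega_{23} = \omega_{32} = -\omega^3{}_2.
\end{aligned}
$$

Comparing the structure equations 6.80 with the expressions for the connection
coefficients above, we find that

$$A = -\frac{1}{2r}\left[1 - \frac{2m}{r}\right], \qquad B = -\frac{1}{2r}\left[1 - \frac{2m}{r}\right], \qquad C = -\frac{1}{r} \cot\theta. \tag{6.81}$$

Similarly, we have

$$\omega^0{}_0 = -\omega^1{}_1, \qquad \omega^0{}_2 = \omega^2{}_1, \qquad \omega^0{}_3 = \omega^3{}_1,$$

hence,

$$\omega^0{}_0 = -\frac{m}{r^2}\, \theta^0, \qquad \omega^0{}_2 = \frac{1}{r}\, \theta^2, \qquad \omega^0{}_3 = \frac{1}{r}\, \theta^3.$$

It is easy to verify that our choices for the $\omega$'s are consistent with the first structure
equations, so by uniqueness, these must be the right values.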

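There is no guesswork in obtaining the curvature forms. All we do is take
the exterior derivative of the connection forms and pick out the components of
the curvature from the second Cartan equations [6.25]. Thus, for example, to
obtain $\Omega^1{}_1$, we proceed as follows.

$$
\begin{aligned}
\Omega^1{}_1 &= d\omega^1{}_1 + \omega^1{}_1 \wedge \omega^1{}_1 + \omega^1{}_2 \wedge \omega^2{}_1 + \omega^1{}_3 \wedge \omega^3{}_1, \\
&= d\left[\frac{m}{r^2}\, \theta^0\right] - \frac{1}{2r^2}\left[1 - \frac{2m}{r}\right] (\theta^2 \wedge \theta^2 + \theta^3 \wedge \theta^3), \\
&= -\frac{2m}{r^3}\, dr \wedge \theta^0, \\
&= -\frac{2m}{r^3}\, \theta^1 \wedge \theta^0.
\end{aligned}
$$

The computation of the other components is straightforward and we just present
the results.

$$
\begin{aligned}
\Omega^1{}_2 &= \frac{1}{r^2} \frac{dm}{du}\, \theta^0 \wedge \theta^2 - \frac{m}{r^3}\, \theta^1 \wedge \theta^2, \\
\Omega^1{}_3 &= \frac{1}{r^2} \frac{dm}{du}\, \theta^0 \wedge \theta^3 - \frac{m}{r^3}\, \theta^1 \wedge \theta^3, \\
\Omega^2{}_1 &= \frac{m}{r^3}\, \theta^2 \wedge \theta^0, \\
\Omega^3{}_1 &= \frac{m}{r^3}\, \theta^3 \wedge \theta^0, \\
\Omega^2{}_3 &= \frac{2m}{r^3}\, \theta^2 \wedge \theta^3.
\end{aligned}
$$

By antisymmetry, these are the only independent components. We can also
read the components of the full Riemann curvature tensor from the definition

$$\Omega^\alpha{}_\beta = \tfrac{1}{2}\, R^\alpha{}_{\beta\gamma\delta}\, \theta^\gamma \wedge \theta^\delta. \tag{6.82}$$

Thus, for example, we have

$$\Omega^1{}_1 = \tfrac{1}{2}\, R^1{}_{1\gamma\delta}\, \theta^\gamma \wedge \theta^\delta,$$

hence

$$R^1{}_{101} = -R^1{}_{110} = \frac{2m}{r^3}, \qquad \text{other } R^1{}_{1\gamma\delta} = 0.$$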

Using the antisymmetry of the curvature forms, we see, that for the Vaidya 
metric Q!) = Qoo = 0, N29 = —O14, etc., so that 
Roo = Ro29 + R*os0 
1 1 
R 220 + R 330 


II 


Substituting the relevant components of the curvature tensor, we find that 
Roo = 2— — (6.83) 
r 


while all the other components of the Ricci tensor vanish. As stated earlier, if 
m is constant, we get the Ricci flat Schwarzschild metric. 


6.6 Geodesics 


Geodesics were introduced in the section on parallelism. The equation of 
geodesics on a manifold given by equation 6.65 involves the Christoffel symbols. 
Whereas it is possible to compute all the Christoffel symbols starting with the 
metric as in equation 4.76, this is most inefficient, as it is often the case that 
many of the Christoffel symbols vanish. Instead, we show next how to obtain 
the geodesic equations by using variational principles 


$$\delta \int L(u^\alpha, \dot{u}^\alpha, s)\, ds = 0, \tag{6.84}$$




to minimize the arc length. Then we can pick out the non-vanishing Christoffel
symbols from the geodesic equation. Following the standard methods of Lagrangian
mechanics, we let $u^\alpha$ and $\dot{u}^\alpha$ be treated as independent (canonical)
coordinates and choose the Lagrangian in this case to be

$$L = g_{\alpha\beta}\,\dot{u}^\alpha \dot{u}^\beta. \tag{6.85}$$


The choice will actually result in minimizing the square of the arc length, but
clearly this is an equivalent problem. It should be observed that the Lagrangian
is basically a multiple of the kinetic energy $\frac{1}{2}mv^2$. The motion dynamics are
given by the Euler-Lagrange equations.

$$\frac{d}{ds}\left(\frac{\partial L}{\partial \dot{u}^\gamma}\right) - \frac{\partial L}{\partial u^\gamma} = 0. \tag{6.86}$$


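Applying these equations, and keeping in mind that $g_{\alpha\beta}$ is the only quantity that
depends on the coordinates $u^\gamma$, we get:

$$
\begin{aligned}
0 &= \frac{d}{ds}\left[g_{\alpha\beta}\,\delta^\alpha_\gamma\,\dot{u}^\beta + g_{\alpha\beta}\,\dot{u}^\alpha\,\delta^\beta_\gamma\right] - g_{\alpha\beta,\gamma}\,\dot{u}^\alpha\dot{u}^\beta \\
  &= \frac{d}{ds}\left[g_{\gamma\beta}\,\dot{u}^\beta + g_{\alpha\gamma}\,\dot{u}^\alpha\right] - g_{\alpha\beta,\gamma}\,\dot{u}^\alpha\dot{u}^\beta \\
  &= g_{\gamma\beta}\,\ddot{u}^\beta + g_{\alpha\gamma}\,\ddot{u}^\alpha + g_{\gamma\beta,\alpha}\,\dot{u}^\alpha\dot{u}^\beta + g_{\alpha\gamma,\beta}\,\dot{u}^\alpha\dot{u}^\beta - g_{\alpha\beta,\gamma}\,\dot{u}^\alpha\dot{u}^\beta \\
  &= 2g_{\gamma\beta}\,\ddot{u}^\beta + \left[g_{\gamma\beta,\alpha} + g_{\alpha\gamma,\beta} - g_{\alpha\beta,\gamma}\right]\dot{u}^\alpha\dot{u}^\beta, \\
0 &= \delta^\sigma_\beta\,\ddot{u}^\beta + \tfrac{1}{2}g^{\sigma\gamma}\left[g_{\gamma\beta,\alpha} + g_{\alpha\gamma,\beta} - g_{\alpha\beta,\gamma}\right]\dot{u}^\alpha\dot{u}^\beta,
\end{aligned}
$$

where the last equation was obtained by contracting with $\frac{1}{2}g^{\sigma\gamma}$ to raise indices.
Comparing with the expression for the Christoffel symbols found in equation
4.76, we get

$$\ddot{u}^\sigma + \Gamma^\sigma_{\alpha\beta}\,\dot{u}^\alpha\dot{u}^\beta = 0,$$

which are exactly the equations of geodesics 6.65.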


6.6.1 Example Geodesics of sphere 
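Let $S^2$ be a sphere of radius $a$ so that the metric is given by

$$ds^2 = a^2\,d\theta^2 + a^2\sin^2\theta\,d\phi^2.$$

Then the Lagrangian is

$$L = a^2\dot{\theta}^2 + a^2\sin^2\theta\,\dot{\phi}^2.$$

The Euler-Lagrange equation for the $\phi$ coordinate is

$$\frac{d}{ds}\left(\frac{\partial L}{\partial\dot{\phi}}\right) - \frac{\partial L}{\partial\phi} = 0, \qquad \frac{d}{ds}\left(2a^2\sin^2\theta\,\dot{\phi}\right) = 0,$$

and therefore the equation integrates to a constant

$$\sin^2\theta\,\dot{\phi} = k.$$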




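Rather than trying to solve the second Euler-Lagrange equation for $\theta$, we invoke
a standard trick that involves reusing the metric. It goes as follows:

$$
\begin{aligned}
\sin^2\theta\,\frac{d\phi}{ds} &= k, \\
\sin^2\theta\,d\phi &= k\,ds, \\
\sin^4\theta\,d\phi^2 &= k^2\,ds^2, \\
\sin^4\theta\,d\phi^2 &= k^2\left(a^2\,d\theta^2 + a^2\sin^2\theta\,d\phi^2\right), \\
\left(\sin^4\theta - k^2a^2\sin^2\theta\right)d\phi^2 &= a^2k^2\,d\theta^2.
\end{aligned}
$$

The last equation above is separable and it can be integrated using the
substitution $u = \cot\theta$.

$$
\begin{aligned}
d\phi &= \frac{ak}{\sin\theta\sqrt{\sin^2\theta - a^2k^2}}\,d\theta, \\
      &= \frac{ak}{\sin^2\theta\sqrt{1 - a^2k^2\csc^2\theta}}\,d\theta, \\
      &= \frac{ak}{\sin^2\theta\sqrt{1 - a^2k^2(1+\cot^2\theta)}}\,d\theta, \\
      &= \frac{ak\csc^2\theta}{\sqrt{(1 - a^2k^2) - a^2k^2\cot^2\theta}}\,d\theta, \\
      &= \frac{\csc^2\theta}{\sqrt{\frac{1-a^2k^2}{a^2k^2} - \cot^2\theta}}\,d\theta, \\
      &= \frac{-1}{\sqrt{c^2 - u^2}}\,du, \qquad \text{where } c^2 = \frac{1-a^2k^2}{a^2k^2}, \\
\phi  &= -\sin^{-1}\left(\tfrac{1}{c}\cot\theta\right) + \phi_0.
\end{aligned}
$$

Here, $\phi_0$ is the constant of integration. To get a geometrical sense of the
geodesic equations we have just derived, we rewrite the equations as follows:

$$
\begin{aligned}
\cot\theta &= c\sin(\phi_0 - \phi), \\
\cos\theta &= c\sin\theta\,(\sin\phi_0\cos\phi - \cos\phi_0\sin\phi), \\
a\cos\theta &= (c\sin\phi_0)(a\sin\theta\cos\phi) - (c\cos\phi_0)(a\sin\theta\sin\phi), \\
z &= Ax - By, \qquad \text{where } A = c\sin\phi_0,\ B = c\cos\phi_0.
\end{aligned}
$$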


We conclude that the geodesics of the sphere are great circles determined by 
the intersections with planes through the origin. 
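
The conclusion above can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes the unit sphere ($a = 1$), uses the second-order geodesic equations $\ddot{\theta} = \sin\theta\cos\theta\,\dot{\phi}^2$ and $\ddot{\phi} = -2\cot\theta\,\dot{\theta}\dot{\phi}$ that follow from the Lagrangian, and integrates them with a hand-written Runge-Kutta step. The function names and the initial data are illustrative choices.

```python
import math

def geodesic_rhs(state):
    """Geodesic equations on the unit sphere as a first-order system.

    state = (theta, phi, theta_dot, phi_dot); dots are derivatives in s.
    """
    theta, phi, theta_dot, phi_dot = state
    theta_ddot = math.sin(theta) * math.cos(theta) * phi_dot ** 2
    phi_ddot = -2.0 * (math.cos(theta) / math.sin(theta)) * theta_dot * phi_dot
    return (theta_dot, phi_dot, theta_ddot, phi_ddot)

def rk4_step(state, h):
    """Advance the state by one classical Runge-Kutta step of size h."""
    def shift(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shift(state, k1, h / 2))
    k3 = geodesic_rhs(shift(state, k2, h / 2))
    k4 = geodesic_rhs(shift(state, k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def embed(theta, phi):
    """Cartesian coordinates (x, y, z) of a point on the unit sphere."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def velocity(state):
    """Tangent vector theta_dot * x_theta + phi_dot * x_phi in R^3."""
    theta, phi, theta_dot, phi_dot = state
    x_theta = (math.cos(theta) * math.cos(phi),
               math.cos(theta) * math.sin(phi),
               -math.sin(theta))
    x_phi = (-math.sin(theta) * math.sin(phi),
             math.sin(theta) * math.cos(phi),
             0.0)
    return tuple(theta_dot * a + phi_dot * b for a, b in zip(x_theta, x_phi))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def check_great_circle(state, s_total=6.0, steps=6000):
    """Return (max distance-like deviation from the initial plane through
    the origin, max drift of the conserved quantity sin^2(theta) * phi_dot)."""
    normal = cross(embed(state[0], state[1]), velocity(state))
    k0 = math.sin(state[0]) ** 2 * state[3]
    h = s_total / steps
    max_dev, max_drift = 0.0, 0.0
    for _ in range(steps):
        state = rk4_step(state, h)
        x = embed(state[0], state[1])
        max_dev = max(max_dev, abs(sum(a * b for a, b in zip(x, normal))))
        max_drift = max(max_drift, abs(math.sin(state[0]) ** 2 * state[3] - k0))
    return max_dev, max_drift
```

Calling `check_great_circle((1.0, 0.3, 0.4, 0.5))` integrates one geodesic that stays away from the poles; both returned numbers should be at the level of the integration error, which is consistent with the curve lying in a plane $z = Ax - By$ through the origin.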


6.6.2 Example Geodesics in orthogonal coordinates. 
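In a parametrization of a surface in which the coordinate lines are orthogonal,
$F = 0$. Then the first fundamental form is

$$ds^2 = E\,du^2 + G\,dv^2,$$

and we have the Lagrangian,

$$L = E\dot{u}^2 + G\dot{v}^2.$$

The Euler-Lagrange equations for the variable $u$ are:

$$
\begin{aligned}
\frac{d}{ds}(2E\dot{u}) - E_u\dot{u}^2 - G_u\dot{v}^2 &= 0, \\
2E\ddot{u} + (2E_u\dot{u} + 2E_v\dot{v})\dot{u} - E_u\dot{u}^2 - G_u\dot{v}^2 &= 0, \\
2E\ddot{u} + E_u\dot{u}^2 + 2E_v\dot{u}\dot{v} - G_u\dot{v}^2 &= 0.
\end{aligned}
$$

Similarly for the variable $v$,

$$
\begin{aligned}
\frac{d}{ds}(2G\dot{v}) - E_v\dot{u}^2 - G_v\dot{v}^2 &= 0, \\
2G\ddot{v} + (2G_u\dot{u} + 2G_v\dot{v})\dot{v} - E_v\dot{u}^2 - G_v\dot{v}^2 &= 0, \\
2G\ddot{v} - E_v\dot{u}^2 + 2G_u\dot{u}\dot{v} + G_v\dot{v}^2 &= 0.
\end{aligned}
$$

So, the equations of geodesics can be written neatly as,

$$
\begin{aligned}
\ddot{u} + \frac{1}{2E}\left[E_u\dot{u}^2 + 2E_v\dot{u}\dot{v} - G_u\dot{v}^2\right] &= 0, \\
\ddot{v} + \frac{1}{2G}\left[G_v\dot{v}^2 + 2G_u\dot{u}\dot{v} - E_v\dot{u}^2\right] &= 0.
\end{aligned}
\tag{6.87}
$$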


6.6.3 Example Geodesics of surface of revolution 
The first fundamental form of a surface of revolution $z = f(r)$ in cylindrical
coordinates, as in 4.7, is

$$ds^2 = (1 + f'^2)\,dr^2 + r^2\,d\phi^2. \tag{6.88}$$

Of course, we could use the expressions for the equations of geodesics we just
derived above, but since the coefficients are functions of $r$ only, it is just as easy
to start from the Lagrangian,

$$L = (1 + f'^2)\,\dot{r}^2 + r^2\dot{\phi}^2.$$

Since there is no dependence on $\phi$, the Euler-Lagrange equation for $\phi$ gives rise
to a conserved quantity.

$$\frac{d}{ds}\left(2r^2\dot{\phi}\right) = 0, \qquad r^2\dot{\phi} = c, \tag{6.89}$$


where $c$ is a constant of integration. If the geodesic $\alpha(s) = \alpha(r(s), \phi(s))$
represents the path of a free particle constrained to move on the surface, this
conserved quantity is essentially the angular momentum. A neat result can be
obtained by considering the angle $\sigma$ that the tangent vector $V = \alpha'$ makes with
a parallel. Recall that the length of $V$ along the geodesic is constant, so let's
set $\|V\| = k$. From the chain rule we have

$$\alpha'(s) = \mathbf{x}_r\,\frac{dr}{ds} + \mathbf{x}_\phi\,\frac{d\phi}{ds}.$$

Then

$$\cos\sigma = \frac{\langle \alpha', \mathbf{x}_\phi\rangle}{\|\alpha'\|\,\|\mathbf{x}_\phi\|} = \frac{G\dot{\phi}}{k\sqrt{G}} = \frac{1}{k}\,r\dot{\phi}.$$

We conclude from 6.89 that for a surface of revolution, the geodesics make an
angle $\sigma$ with parallels that satisfies the equation

$$r\cos\sigma = \text{constant}. \tag{6.90}$$


This result is called Clairaut's relation. Writing equation 6.89 in terms of
differentials, and reusing the metric as we did in the computation of the geodesics
for a sphere, we get

$$
\begin{aligned}
r^2\,d\phi &= c\,ds, \\
r^4\,d\phi^2 &= c^2\,ds^2, \\
             &= c^2\left[(1 + f'^2)\,dr^2 + r^2\,d\phi^2\right], \\
(r^4 - c^2r^2)\,d\phi^2 &= c^2(1 + f'^2)\,dr^2, \\
r\sqrt{r^2 - c^2}\,d\phi &= \pm c\sqrt{1 + f'^2}\,dr,
\end{aligned}
$$

so

$$\phi = \pm c\int \frac{\sqrt{1 + f'^2}}{r\sqrt{r^2 - c^2}}\,dr. \tag{6.91}$$

If $c = 0$, then the first equation above gives $\phi$ = constant, so the meridians are
geodesics. The parallels $r$ = constant are geodesics when $f'(r) = \infty$, in which
case the tangent bundle restricted to the parallel is a cylinder with a vertical
generator.

In the particular case of a cone of revolution with a generator that makes
an angle $\alpha$ with the z-axis, $f(r) = \cot(\alpha)\,r$, equation 6.91 becomes:

$$\phi = \pm c\int \frac{\sqrt{1 + \cot^2\alpha}}{r\sqrt{r^2 - c^2}}\,dr,$$

which can be immediately integrated to yield

$$\phi = \pm\csc\alpha\,\sec^{-1}(r/c). \tag{6.92}$$

As shown in figure 6.4, a ribbon laid flat around a cone follows the path of
a geodesic. None of the parallels, which in this case are the circular cross-sections
of the cone, are geodesics.
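
As a quick sanity check of the integration step, the sketch below compares a numerical derivative of the closed form 6.92 (upper sign) with the integrand of 6.91 for $f(r) = \cot(\alpha)\,r$. It is only an illustration; the function names and the sample values of $c$ and $\alpha$ are assumptions made for the example.

```python
import math

def phi_cone(r, c, alpha):
    """Closed form 6.92 with the upper sign: csc(alpha) * arcsec(r / c), r >= c > 0."""
    return math.acos(c / r) / math.sin(alpha)

def integrand(r, c, alpha):
    """Integrand of 6.91 for the cone, where f'(r) = cot(alpha)."""
    f_prime = 1.0 / math.tan(alpha)
    return c * math.sqrt(1.0 + f_prime ** 2) / (r * math.sqrt(r * r - c * c))

def central_difference(func, r, h=1e-6):
    """Symmetric finite-difference approximation of func'(r)."""
    return (func(r + h) - func(r - h)) / (2.0 * h)

def max_mismatch(c=1.0, alpha=0.5, radii=(1.5, 2.0, 5.0)):
    """Largest gap between d(phi)/dr and the integrand over the sample radii."""
    return max(
        abs(central_difference(lambda x: phi_cone(x, c, alpha), r)
            - integrand(r, c, alpha))
        for r in radii
    )
```

A small value of `max_mismatch()` indicates that 6.92 is indeed an antiderivative of the integrand; the identity behind it is $\sqrt{1+\cot^2\alpha} = \csc\alpha$ together with $\frac{d}{dr}\sec^{-1}(r/c) = c/(r\sqrt{r^2-c^2})$.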




Fig. 6.4: Geodesics on a Cone. 


6.7 Geodesics in GR 


6.7.1 Example Morris-Thorne (MT) wormhole 


In 1987, Michael Morris and Kip Thorne from the California Institute of Technology
proposed a tantalizingly simple model for teaching general relativity, by
alluding to interspace travel in the geometry of a traversable wormhole. We constrain
the discussion purely to geometrical aspects of the model and not the
physics of stresses and strains on a “traveler” traversing the wormhole. The MT
metric for this spherically symmetric geometry is

$$ds^2 = -c^2\,dt^2 + dl^2 + (b_0^2 + l^2)(d\theta^2 + \sin^2\theta\,d\phi^2), \tag{6.93}$$

where $b_0$ is a constant. The obvious choice for a coframe is

$$
\begin{aligned}
\theta^0 &= c\,dt, & \theta^2 &= \sqrt{b_0^2 + l^2}\,d\theta, \\
\theta^1 &= dl, & \theta^3 &= \sqrt{b_0^2 + l^2}\,\sin\theta\,d\phi.
\end{aligned}
$$

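We have $d\theta^0 = d\theta^1 = 0$. To find the connection forms we compute the exterior
derivatives of the coframe forms $\theta^2$ and $\theta^3$, and rewrite in terms of the coframe. We get

$$
\begin{aligned}
d\theta^2 &= \frac{l}{\sqrt{b_0^2 + l^2}}\,dl\wedge d\theta = -\frac{l}{\sqrt{b_0^2 + l^2}}\,d\theta\wedge dl, \\
          &= -\frac{l}{b_0^2 + l^2}\,\theta^2\wedge\theta^1, \\
d\theta^3 &= \frac{l}{\sqrt{b_0^2 + l^2}}\,\sin\theta\,dl\wedge d\phi + \cos\theta\sqrt{b_0^2 + l^2}\,d\theta\wedge d\phi, \\
          &= -\frac{l}{b_0^2 + l^2}\,\theta^3\wedge\theta^1 - \frac{\cot\theta}{\sqrt{b_0^2 + l^2}}\,\theta^3\wedge\theta^2.
\end{aligned}
$$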


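Comparing with the first equation of structure, we start with the simplest guess for
the connection forms $\omega$'s. That is, we set

$$
\begin{aligned}
\omega^2{}_1 &= \frac{l}{b_0^2 + l^2}\,\theta^2, \\
\omega^3{}_1 &= \frac{l}{b_0^2 + l^2}\,\theta^3, \\
\omega^3{}_2 &= \frac{\cot\theta}{\sqrt{b_0^2 + l^2}}\,\theta^3.
\end{aligned}
$$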


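Using the antisymmetry of the $\omega$'s and the diagonal metric, we have $\omega^2{}_1 = -\omega^1{}_2$,
$\omega^3{}_1 = -\omega^1{}_3$, and $\omega^3{}_2 = -\omega^2{}_3$. This choice of connection coefficients
turns out to be completely compatible with the entire set of Cartan's first
equations of structure, so these are the connection forms; all other $\omega$'s are
zero. We can then proceed to evaluate the curvature forms. A straightforward
calculus computation, which results in some pleasing cancellations, yields

$$
\begin{aligned}
\Omega^1{}_2 &= d\omega^1{}_2 + \omega^1{}_3\wedge\omega^3{}_2 = -\frac{b_0^2}{(b_0^2 + l^2)^2}\,\theta^1\wedge\theta^2, \\
\Omega^1{}_3 &= d\omega^1{}_3 + \omega^1{}_2\wedge\omega^2{}_3 = -\frac{b_0^2}{(b_0^2 + l^2)^2}\,\theta^1\wedge\theta^3, \\
\Omega^2{}_3 &= d\omega^2{}_3 + \omega^2{}_1\wedge\omega^1{}_3 = \frac{b_0^2}{(b_0^2 + l^2)^2}\,\theta^2\wedge\theta^3.
\end{aligned}
$$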


Thus, from equation 6.36, other than permutations of the indices, the only
independent components of the Riemann tensor are

$$R_{2323} = -R_{1212} = -R_{1313} = \frac{b_0^2}{(b_0^2 + l^2)^2},$$

and the only non-zero component of the Ricci tensor is

$$R_{11} = -2\,\frac{b_0^2}{(b_0^2 + l^2)^2}.$$

Of course, this space is a 4-dimensional continuum, but since the space is
spherically symmetric, we may get a good sense of the geometry by taking a slice
with $\theta = \pi/2$ at a fixed value of time. The resulting metric $ds_2^2$ for the surface
is

$$ds_2^2 = dl^2 + (b_0^2 + l^2)\,d\phi^2. \tag{6.94}$$

Let $r^2 = b_0^2 + l^2$. Then $dl^2 = (r^2/l^2)\,dr^2$ and the metric becomes

$$ds_2^2 = \frac{r^2}{r^2 - b_0^2}\,dr^2 + r^2\,d\phi^2, \tag{6.95}$$

$$\phantom{ds_2^2} = \frac{1}{1 - b_0^2/r^2}\,dr^2 + r^2\,d\phi^2. \tag{6.96}$$



Comparing to 4.26 we recognize this to be a catenoid of revolution, so the
equations of geodesics are given by 6.91 with $f(r) = b_0\cosh^{-1}(r/b_0)$. Substituting
this value of $f$ into the geodesic equation, we get

$$\phi = \pm c\int \frac{1}{\sqrt{r^2 - b_0^2}\,\sqrt{r^2 - c^2}}\,dr. \tag{6.97}$$

There are three cases. If $c = b_0$, the integral gives immediately, for $r > b_0$,

$$\phi = \pm(c/b_0)\coth^{-1}(r/b_0).$$


Fig. 6.5: Geodesics on Catenoid. 
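We consider the case $c > b_0$. The remaining case can be treated in a similar
fashion. Let $r = c/\sin\beta$. Then $\sqrt{r^2 - c^2} = r\cos\beta$ and $dr = -r\cot\beta\,d\beta$, so,
assuming the initial condition $\phi(0) = 0$ and choosing the sign in 6.97 so that $\phi$
increases with $\beta$, the substitution leads to the integral

$$
\begin{aligned}
\phi &= -c\int_0^s \frac{1}{r\cos\beta\sqrt{\frac{c^2}{\sin^2\beta} - b_0^2}}\,(-r\cot\beta)\,d\beta, \\
     &= c\int_0^s \frac{1}{\sqrt{c^2 - b_0^2\sin^2\beta}}\,d\beta,
\end{aligned}
$$

$$\phantom{\phi} = \int_0^s \frac{1}{\sqrt{1 - k^2\sin^2\beta}}\,d\beta, \qquad (k = b_0/c) \tag{6.98}$$

$$\phantom{\phi} = F(s, k), \tag{6.99}$$

where $F(s, k)$ is the well-known incomplete elliptic integral of the first kind.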

Elliptic integrals are standard functions implemented in computer algebra 
systems, so it is easy to render some geodesics as shown in figure 6.5. The plot 
of the elliptic integral shown here is for k = 0.9. The plot shows clearly that
this function is one-to-one, so if one wishes to express r in terms of $\phi$, one just finds the inverse
of the elliptic integral, which yields a Jacobi elliptic function. Thomas Muller
has created a neat Wolfram-Demonstration that allows the user to play with 
MT wormhole geodesics with parameters controlled by sliders. 
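
For readers without a computer algebra system at hand, the incomplete elliptic integral in 6.98 can also be approximated directly. The sketch below is one possible way to do it, using composite Simpson quadrature from the Python standard library; the function name `incomplete_elliptic_f` and the default number of subintervals are illustrative assumptions, and the argument `k` is the modulus that appears in 6.98 (some libraries use the parameter $m = k^2$ instead).

```python
import math

def incomplete_elliptic_f(s, k, n=2000):
    """Approximate F(s, k) = integral from 0 to s of
    d(beta) / sqrt(1 - k^2 sin^2(beta)) by composite Simpson's rule.

    n is the number of subintervals (rounded up to an even number).
    Assumes the integrand is finite on [0, s], i.e. k * sin(beta) < 1 there.
    """
    if n % 2:
        n += 1
    h = s / n

    def g(beta):
        return 1.0 / math.sqrt(1.0 - (k * math.sin(beta)) ** 2)

    total = g(0.0) + g(s)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return total * h / 3.0

# Sample values for k = 0.9, the modulus used in the plot discussed above.
samples = [incomplete_elliptic_f(s, 0.9) for s in (0.25, 0.5, 0.75, 1.0, 1.25, 1.5)]
```

Two limiting cases give easy checks: for $k = 0$ the integrand is 1, so $F(s, 0) = s$, and for $k = 1$ the integral is elementary, $F(s, 1) = \tanh^{-1}(\sin s)$ for $0 \le s < \pi/2$. The list `samples` is strictly increasing, in line with the remark that the function is one-to-one.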


6.7.2 Example Schwarzschild Metric 




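In this section we look at the geodesic equations in a Schwarzschild gravitational
field, with particular emphasis on the bounded orbits. We write the metric in
the form

$$ds^2 = -h(r)\,dt^2 + \frac{1}{h(r)}\,dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\phi^2), \tag{6.100}$$

where

$$h(r) = 1 - \frac{2GM}{r}. \tag{6.101}$$

Thus, the Lagrangian is

$$L = -h\,\dot{t}^2 + \frac{1}{h}\,\dot{r}^2 + r^2\dot{\theta}^2 + r^2\sin^2\theta\,\dot{\phi}^2. \tag{6.102}$$

The Euler-Lagrange equations for $g_{00}$, $g_{22}$ and $g_{33}$ (that is, for the coordinates
$t$, $\theta$ and $\phi$) yield

$$
\begin{aligned}
\frac{d}{ds}\left[h\,\frac{dt}{ds}\right] &= 0, \\
\frac{d}{ds}\left[r^2\,\frac{d\theta}{ds}\right] - r^2\sin\theta\cos\theta\left[\frac{d\phi}{ds}\right]^2 &= 0, \\
\frac{d}{ds}\left[r^2\sin^2\theta\,\frac{d\phi}{ds}\right] &= 0.
\end{aligned}
$$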


If in the equation for $g_{22}$, one chooses initial conditions $\theta(0) = \pi/2$, $\dot{\theta}(0) = 0$,
we get $\theta(s) = \pi/2$ along the geodesic. We infer from rotation invariance that
the motion takes place on a plane. Hereafter, we assume we have taken these
initial conditions. From the other two equations we obtain

$$h\,\frac{dt}{ds} = E, \qquad r^2\,\frac{d\phi}{ds} = L,$$

for some constants E and L. We recognize the conserved quantities as the
“energy” and the angular momentum. Along the geodesic of a massive particle,
with unit time-like tangent vector, we have

$$g_{\mu\nu}\,\frac{dx^\mu}{ds}\frac{dx^\nu}{ds} = -1. \tag{6.103}$$


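The equations of motion then reduce to

$$
\begin{aligned}
-1 &= -h\left[\frac{dt}{ds}\right]^2 + \frac{1}{h}\left[\frac{dr}{ds}\right]^2 + r^2\left[\frac{d\phi}{ds}\right]^2, \\
-1 &= -\frac{E^2}{h} + \frac{1}{h}\left[\frac{dr}{ds}\right]^2 + \frac{L^2}{r^2}.
\end{aligned}
$$

Hence, we obtain the neat equation,

$$E^2 = \left[\frac{dr}{ds}\right]^2 + V(r), \tag{6.104}$$

where V(r) represents the effective potential.

$$
\begin{aligned}
V(r) &= \left(1 + \frac{L^2}{r^2}\right)h(r) = \left(1 + \frac{L^2}{r^2}\right)\left(1 - \frac{2GM}{r}\right), \\
     &= 1 - \frac{2GM}{r} + \frac{L^2}{r^2} - \frac{2MGL^2}{r^3}.
\end{aligned}
\tag{6.105}
$$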
If we divide this expression by 2, we recognize the classical $1/r$ potential,
and the $1/r^2$ term corresponding to the Coriolis contribution associated with
the angular momentum. The $1/r^3$ term is a new term arising from general
relativity. Clearly we must have $E^2 \geq V(r)$. There are multiple cases depending
on the values of E and L and the nature of the equilibrium points. Here we are
primarily concerned with bounded orbits, so we seek conditions for the particle
to be in a potential well. This presents us with a nice calculus problem. We
compute $V'(r)$ and set it equal to zero to find the critical points

$$V'(r) = \frac{2}{r^4}\left(GMr^2 - L^2r + 3GML^2\right) = 0.$$

Up to a positive factor of $L^2$, the discriminant of the quadratic is

$$D = L^2 - 12G^2M^2.$$

If $D < 0$ there are no critical points. In this case, $V(r)$ is a monotonically
increasing function on the interval $(2MG, \infty)$, as shown in the bottom left
graph in figure 6.6. The Maple plots in this figure are in units with GM = 1.
In the case $D < 0$, all trajectories either fall toward the event horizon or escape
to infinity.

Fig. 6.6: Effective Potential for L = 3,4,5
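
The case analysis by the sign of $D$ is easy to explore numerically. The sketch below works in units with $GM = 1$, like the plots, and simply solves the quadratic $GMr^2 - L^2r + 3GML^2 = 0$ that comes from $V'(r) = 0$. The function names are illustrative assumptions, not notation from the text.

```python
import math

def effective_potential(r, L, GM=1.0):
    """V(r) = (1 + L^2 / r^2) (1 - 2GM / r), as in equation 6.105."""
    return (1.0 + L * L / (r * r)) * (1.0 - 2.0 * GM / r)

def circular_orbit_radii(L, GM=1.0):
    """Critical points of V, i.e. roots of GM r^2 - L^2 r + 3 GM L^2 = 0.

    Returns None when D = L^2 - 12 G^2 M^2 is negative (no critical points),
    otherwise the pair (inner, outer).
    """
    D = L * L - 12.0 * GM * GM
    if D < 0.0:
        return None
    root = L * math.sqrt(D)
    return ((L * L - root) / (2.0 * GM), (L * L + root) / (2.0 * GM))

no_orbits = circular_orbit_radii(3.0)      # L^2 = 9 < 12, so D < 0
inner, outer = circular_orbit_radii(4.0)   # D = 4 > 0, two critical points
barrier_height = effective_potential(inner, 4.0)
```

With $L = 3$ the function returns `None`, matching the monotonic graph, while $L = 4$ gives two critical points. For any $L$ with $D > 0$ the product of the two radii equals $3L^2$, which follows from the coefficients of the quadratic.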


If $D > 0$, there are two critical points

$$r_1 = \frac{L^2 - L\sqrt{L^2 - 12G^2M^2}}{2GM}, \qquad r_2 = \frac{L^2 + L\sqrt{L^2 - 12G^2M^2}}{2GM}.$$

The critical point $r_1$ is a local maximum associated with an unstable circular
orbit. The critical point $r_2 > r_1$ gives a stable circular orbit. Using the standard
calculus trick of multiplying by the conjugate of the radical in the first term,
we see that

$$r_1 \to 3GM, \qquad r_2 \to \frac{L^2}{GM},$$

as $L \to \infty$. For any L, the properties of the roots of the quadratic imply
that $r_1r_2 = 3L^2$. As shown in figure 6.6, as L gets larger, the inner radius
approaches $3GM$ and the height of the bump increases, whereas the outer radius
recedes to infinity. As the value of D approaches 0, the two orbits coalesce at
$L^2 = 12G^2M^2$, which corresponds to $r = 6GM$, so this is the smallest value of
r at which a stable circular orbit can exist. Since $V(r) \to 1$ as $r \to \infty$, to get
bounded orbits we want a potential well with $V(r_1) < 1$. We can easily verify
that when $L = 4GM$ the local maximum occurs at $r_1 = 4GM$, which results
in a value of $V(r_1) = 1$. This case is the one depicted in the middle graph in
figure 6.6, with the graph of $V'(r)$ on the right showing the two critical points
at $r_1 = 4GM$, $r_2 = 12GM$. Hence the condition to get a bounded orbit is

$$2\sqrt{3}\,GM < L < 4GM, \qquad E^2 < V(r_1), \quad r > r_1,$$


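so that the energy results in the particle trapped in the potential well to the
right of $r_1$. This is the case that applies to the modification of the Kepler orbits
of planets. If we rewrite

$$\frac{dr}{ds} = \frac{dr}{d\phi}\frac{d\phi}{ds} = \frac{L}{r^2}\frac{dr}{d\phi},$$

and substitute into equation 6.104, we get

$$\frac{L^2}{r^4}\left[\frac{dr}{d\phi}\right]^2 = E^2 - \left(1 + \frac{L^2}{r^2}\right)\left(1 - \frac{2GM}{r}\right).$$

If now we change variables to $u = 1/r$, we obtain

$$\frac{du}{d\phi} = -\frac{1}{r^2}\frac{dr}{d\phi} = -u^2\,\frac{dr}{d\phi},$$

and the orbit equation becomes

$$\left[\frac{du}{d\phi}\right]^2 = \frac{1}{L^2}\left[E^2 - (1 + L^2u^2)(1 - 2GMu)\right],$$

$$\phi = \int \frac{L\,du}{\sqrt{E^2 - (1 + L^2u^2)(1 - 2GMu)}} + \phi_0.$$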


The solution of the orbit equation is therefore reduced to an elliptic integral. If
we expand the denominator

$$\phi = \int \frac{L\,du}{\sqrt{(E^2 - 1) + 2GMu - L^2u^2 + 2GML^2u^3}} + \phi_0,$$

and neglect the cubic term, we can complete the squares of the remaining
quadratic. The integral becomes one of standard inverse cosine type; hence,
the solution gives the equation of an ellipse in polar coordinates

$$u = \frac{1}{r} = C(1 + e\cos(\phi - \phi_0)),$$


for appropriate constants C, shift $\phi_0$ and eccentricity e. The solution is
automatically expressed in terms of the energy and the angular momentum of the
system. More careful analysis of the integral shows that the inclusion of the
cubic term perturbs the orbit by a precession of the ellipse. While this
approach is slicker, we prefer to use the more elementary procedure of differential
equations. Differentiating with respect to $\phi$ the equation

$$L^2\left[\frac{du}{d\phi}\right]^2 = (E^2 - 1) + 2GMu - L^2u^2 + 2GML^2u^3,$$

and cancelling out the common chain rule factor $du/d\phi$, we get

$$\frac{d^2u}{d\phi^2} + u = \frac{GM}{L^2} + 3GMu^2.$$

Introducing a dimensionless parameter

$$\epsilon = \frac{3G^2M^2}{L^2},$$

we can rewrite the equation of motion as

$$\frac{d^2u}{d\phi^2} + u = \frac{GM}{L^2} + \epsilon\,\frac{L^2}{GM}\,u^2. \tag{6.106}$$


The linear part of the equation corresponds precisely to Newtonian motion, and
$\epsilon$ is small, so we can treat the quadratic term as a perturbation

$$u = u_0 + u_1\epsilon + u_2\epsilon^2 + \dots$$

Substituting u into equation 6.106, the first approximation is the linear
approximation given by

$$u_0'' + u_0 = \frac{GM}{L^2}.$$

The homogeneous solution is of the form $u = A\cos(\phi - \phi_0)$, where A and $\phi_0$
are the arbitrary constants, and the particular solution is a constant. So the
general solution is

$$
\begin{aligned}
u_0 &= \frac{GM}{L^2} + A\cos(\phi - \phi_0), \\
    &= \frac{GM}{L^2}\left[1 + e\cos(\phi - \phi_0)\right], \qquad e = \frac{AL^2}{GM}.
\end{aligned}
$$


Without loss of generality, we can align the axes and set $\phi_0 = 0$. In the
Newtonian orbit, we would write $u_0 = 1/r$, thus getting the equation of a polar
conic.

$$u_0 = \frac{GM}{L^2}(1 + e\cos\phi). \tag{6.107}$$

In the case of the planets, the eccentricity $e < 1$, so the conics are ellipses.
Having found $u_0$, we reinsert u into the differential equation 6.106, keeping
only the terms of order $\epsilon$. We get

$$
\begin{aligned}
(u_0 + u_1\epsilon)'' + (u_0 + u_1\epsilon) &= \frac{GM}{L^2} + \epsilon\,\frac{L^2}{GM}(u_0 + u_1\epsilon)^2, \\
\left(u_0'' + u_0 - \frac{GM}{L^2}\right) + (u_1'' + u_1)\,\epsilon &= \epsilon\,\frac{L^2}{GM}\,u_0^2 + O(\epsilon^2).
\end{aligned}
$$

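Thus, the result is a new differential equation for $u_1$,

$$u_1'' + u_1 = \frac{L^2}{GM}\,u_0^2 = \frac{GM}{L^2}\left[\left(1 + \tfrac{1}{2}e^2\right) + 2e\cos\phi + \tfrac{1}{2}e^2\cos 2\phi\right].$$

The equation is again a linear inhomogeneous equation with constant
coefficients, so it is easily solved by elementary methods. We do have to be a bit
careful since we have a resonant term on the right hand side. The solution is

$$u_1 = \frac{GM}{L^2}\left[\left(1 + \tfrac{1}{2}e^2\right) + e\,\phi\sin\phi - \tfrac{1}{6}e^2\cos 2\phi\right].$$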

The resonant term $\phi\sin\phi$ makes the solution non-periodic, so this is the term
responsible for the precession of the elliptical orbits. The precession is obtained
by looking at the perihelion, that is, the point in the elliptical orbit at which
the planet is closest to the sun. This happens when

$$
\begin{aligned}
\frac{du}{d\phi} = \frac{d}{d\phi}(u_0 + u_1\epsilon) &= 0, \\
-\sin\phi + \epsilon\left(\sin\phi + \phi\cos\phi + \tfrac{1}{3}e\sin 2\phi\right) &= 0.
\end{aligned}
$$

Starting with the solution $\phi = 0$, after one revolution, the perihelion drifts to
$\phi = 2\pi + \delta$. By the perturbation assumptions, we assume $\delta$ is small, so to lowest
order, the perihelion advance in one revolution is

$$\delta = 2\pi\epsilon = \frac{6\pi G^2M^2}{L^2}. \tag{6.108}$$
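
The perturbation step can be checked numerically. In the sketch below, which is only an illustration, $u_1$ is measured in units of $GM/L^2$, so the claim to test is $u_1'' + u_1 = (1 + e\cos\phi)^2$; the perihelion condition is then solved near $\phi = 2\pi$ by bisection and compared with $2\pi\epsilon$. The function names and the sample values of $e$ and $\epsilon$ are assumptions made for the example.

```python
import math

def u1(phi, e):
    """First-order correction in units of GM / L^2, including the
    resonant term e * phi * sin(phi)."""
    return (1.0 + e * e / 2.0) + e * phi * math.sin(phi) - (e * e / 6.0) * math.cos(2.0 * phi)

def ode_residual(phi, e, h=1e-4):
    """u1'' + u1 - (1 + e cos(phi))^2, with u1'' from a central difference."""
    second = (u1(phi + h, e) - 2.0 * u1(phi, e) + u1(phi - h, e)) / (h * h)
    return second + u1(phi, e) - (1.0 + e * math.cos(phi)) ** 2

def perihelion_shift(e, eps):
    """Offset delta of the root of du/dphi = 0 closest to phi = 2*pi."""
    def du(phi):
        return (-e * math.sin(phi)
                + eps * (e * math.sin(phi) + e * phi * math.cos(phi)
                         + (e * e / 3.0) * math.sin(2.0 * phi)))
    lo, hi = 2.0 * math.pi - 0.5, 2.0 * math.pi + 0.5  # du(lo) > 0 > du(hi)
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if du(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) - 2.0 * math.pi

worst_residual = max(abs(ode_residual(p, 0.2)) for p in (0.3, 1.0, 2.5, 4.0, 6.0))
shift = perihelion_shift(0.2, 1e-4)
```

For small $\epsilon$ the computed `shift` agrees with $2\pi\epsilon$ up to corrections of relative size $\epsilon$, which is the content of 6.108.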


From equation 6.107 for the Newtonian elliptical orbit, the mean distance a to
the sun is given by the average of the aphelion and perihelion distances, that is

$$a = \frac{1}{2}\left[\frac{L^2/GM}{1 + e} + \frac{L^2/GM}{1 - e}\right] = \frac{L^2}{GM}\,\frac{1}{1 - e^2}.$$

Thus, if we divide by the period T, the rate of perihelion advance can be written
in more geometric terms as

$$\frac{\delta}{T} = \frac{6\pi GM}{a(1 - e^2)\,T}.$$


The famous computation by Einstein of a precession of 43.1″ of arc per
century for the perihelion advance of the orbit of Mercury still stands as one
of the major achievements in modern physics.
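
To see the order of magnitude, one can evaluate the formula for Mercury. The sketch below is an illustration under stated assumptions: the text works in units with $c = 1$, so the speed of light is restored as $\delta = 6\pi GM/(c^2a(1-e^2))$, and the orbital constants are approximate reference values supplied here, not numbers taken from the text.

```python
import math

# Approximate reference values (assumptions for this example, SI units).
GM_SUN = 1.32712440018e20    # gravitational parameter of the sun, m^3 / s^2
C_LIGHT = 299792458.0        # speed of light, m / s
A_MERCURY = 5.7909e10        # semi-major axis of Mercury's orbit, m
E_MERCURY = 0.2056           # orbital eccentricity of Mercury
T_MERCURY_DAYS = 87.969      # orbital period of Mercury, days

def perihelion_advance_per_orbit(gm, a, e, c=C_LIGHT):
    """delta = 6 pi G M / (c^2 a (1 - e^2)), in radians per revolution."""
    return 6.0 * math.pi * gm / (c * c * a * (1.0 - e * e))

def arcseconds_per_century(delta, period_days):
    """Accumulated advance over one Julian century (36525 days)."""
    orbits = 36525.0 / period_days
    return math.degrees(delta * orbits) * 3600.0

delta_mercury = perihelion_advance_per_orbit(GM_SUN, A_MERCURY, E_MERCURY)
advance = arcseconds_per_century(delta_mercury, T_MERCURY_DAYS)
```

With these inputs `advance` comes out close to 43 arcseconds per century, in line with the value quoted above; the last digits depend on the precision of the assumed constants.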


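For null geodesics, equation 6.103 is replaced by

$$0 = g_{\mu\nu}\,\frac{dx^\mu}{ds}\frac{dx^\nu}{ds},$$

so the orbit is given by the simpler equation

$$E^2 = \left[\frac{dr}{ds}\right]^2 + \frac{L^2}{r^2}\,h(r).$$

Performing the change of variables $u = 1/r$, we get

$$\frac{d^2u}{d\phi^2} + u = 3GMu^2.$$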
Consider the problem of light rays from a distant star grazing the sun as they
approach the earth. Since the space is asymptotically flat, we expect the geodesics
to be asymptotically straight. The quantity $3GM$ is of the order of a few kilometers, so it is
very small compared to the radius of the sun, so again we can use perturbation
methods. We let $\epsilon = 3GM$ and consider solutions of equation

$$u'' + u = \epsilon u^2,$$

of the form

$$u = u_0 + u_1\epsilon.$$

To lowest order the solutions are indeed straight lines

$$
\begin{aligned}
u_0 &= A\cos\phi + B\sin\phi, \\
1 &= Ar\cos\phi + Br\sin\phi, \\
1 &= Ax + By.
\end{aligned}
$$

Without loss of generality, we can align the vertical axis parallel to the incoming
light with impact parameter b (distance of closest approach)

$$u_0 = \frac{1}{b}\cos\phi.$$

As above, we reinsert u into the differential equation and compare the
coefficients of terms of order $\epsilon$. We get an equation for $u_1$,

$$u_1'' + u_1 = \frac{1}{b^2}\cos^2\phi = \frac{1}{2b^2}(1 + \cos 2\phi).$$

We solve the differential equation by the method of undetermined coefficients
and thus we arrive at the perturbation solution to order $\epsilon$

$$u = \frac{1}{b}\cos\phi + \frac{2\epsilon}{3b^2} - \frac{\epsilon}{3b^2}\cos^2\phi.$$

To find the asymptotic angle of the outgoing photons, we let $r \to \infty$ or
$u \to 0$. Thus we get a quadratic equation for $\cos\phi$,

$$\frac{\epsilon}{3b^2}\cos^2\phi - \frac{1}{b}\cos\phi - \frac{2\epsilon}{3b^2} = 0.$$
Set $\phi = \frac{\pi}{2} + \delta$. Since $\delta$ is small, we have $\sin\delta \approx \delta$, and we see that $\delta = 2GM/b$
is the approximation of the deflection angle of one of the asymptotes. The total
deflection is twice that angle

$$2\delta = \frac{4GM}{b}.$$

The computation results in a deflection by the sun of light rays from a distant
star of about 1.75″. This was corroborated in an experiment led by Eddington
during the total solar eclipse of 1919. The part of the expedition in Brazil was
featured in the 2005 movie, The House of Sand. For more details and more
careful analysis of the geodesics, see for example, Misner, Thorne and Wheeler
[21].
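
The size of the deflection can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions: the speed of light is restored, so the total deflection is $4GM/(c^2b)$, the impact parameter is taken to be the solar radius for a grazing ray, and the constants are approximate reference values supplied for the example rather than values from the text.

```python
import math

# Approximate reference values (assumptions for this example, SI units).
GM_SUN = 1.32712440018e20   # gravitational parameter of the sun, m^3 / s^2
C_LIGHT = 299792458.0       # speed of light, m / s
R_SUN = 6.957e8             # solar radius, used as the impact parameter b, m

def total_deflection(gm, b, c=C_LIGHT):
    """Total bending angle 2 * delta = 4 G M / (c^2 b), in radians."""
    return 4.0 * gm / (c * c * b)

def to_arcseconds(angle_radians):
    return math.degrees(angle_radians) * 3600.0

deflection = to_arcseconds(total_deflection(GM_SUN, R_SUN))
```

The value of `deflection` is close to 1.75 arcseconds, consistent with the figure quoted above. Rays passing farther from the sun have a larger $b$ and are bent proportionally less.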


6.8 Gauss-Bonnet Theorem 


This section is dedicated to the memory of Professor S.-S. Chern. I preface
the section with a short anecdote that I often narrate to my students. In
June 1979, an international symposium on differential geometry was held at the
Berkeley campus in honor of the retirement of Professor Chern. The invited
speakers included an impressive list of the most famous differential geometers
at the time. At the end of the symposium, Chern walked on the stage of the
packed auditorium to give thanks and to answer some questions. After a few
short remarks, a member of the audience asked Chern what he thought was
the most important theorem in differential geometry. Without any hesitation
he answered, “there is only one theorem in differential geometry, and that is
Stokes’ theorem.” This was followed immediately by a question about the most
important theorem in analysis. Chern gave the same answer: “there is only one
theorem in analysis, Stokes’ theorem.” A third person then asked Chern what
was the most important theorem in complex variables. To the amusement of
the crowd, Chern responded, “There is only one theorem in complex variables,
and that is Cauchy’s theorem. But if one assumes the derivative of the
function is continuous, then this is just Stokes’ theorem.” Now, of course it
is well known that Goursat proved that the hypothesis of continuity of the
derivative is automatically satisfied when the function is holomorphic. But the
genius of Chern was always his uncanny ability to extract the essence of what
makes things work, in the simplest terms.

The Gauss-Bonnet theorem is rooted in the theorem of Gauss (4.72), which
combined with Stokes’ theorem, provides a beautiful geometrical interpretation
of the equation. This is undoubtedly part of what Chern had in mind at the
symposium, and also when he wrote in his Euclidean Differential Geometry Notes
(Berkeley 1975) [4] that the theorem has “profound consequences and is perhaps
one of the most important theorems in mathematics.”

Let $\beta(s)$ be a unit speed curve on an orientable surface M, and let T be the
unit tangent vector. There is a Frenet frame formalism for M, but if we think of
the surface intrinsically as a 2-dimensional manifold, then there is no binormal.
However, we can define a “geodesic normal” by taking $G = J(T)$, where J is
the symplectic form 5.50. Then the geodesic curvature is given by the Frenet
formula

$$T' = \kappa_g\,G. \tag{6.109}$$


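6.8.1 Proposition Let $\{e_1, e_2\}$ be an orthonormal frame on M, and let $\beta(s)$ be a
unit speed curve as above, with unit tangent T. If $\phi$ is the angle that T makes
with $e_1$, then

$$\kappa_g = \frac{\partial\phi}{\partial s} - \omega^1{}_2(T). \tag{6.110}$$

Proof Since $\{T, G\}$ and $\{e_1, e_2\}$ are both orthonormal bases of the tangent
space, they must be related by a rotation by an angle $\phi$, that is

$$
\begin{bmatrix} T \\ G \end{bmatrix} =
\begin{bmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{bmatrix}
\begin{bmatrix} e_1 \\ e_2 \end{bmatrix},
\tag{6.111}
$$

that is,

$$
\begin{aligned}
T &= (\cos\phi)\,e_1 + (\sin\phi)\,e_2, \\
G &= -(\sin\phi)\,e_1 + (\cos\phi)\,e_2.
\end{aligned}
\tag{6.112}
$$

Since $T = \beta'$, and $\beta'' = \nabla_T T$, we have

$$
\begin{aligned}
\beta'' &= -(\sin\phi)\frac{d\phi}{ds}\,e_1 + \cos\phi\,\nabla_T e_1 + (\cos\phi)\frac{d\phi}{ds}\,e_2 + \sin\phi\,\nabla_T e_2, \\
        &= -(\sin\phi)\frac{d\phi}{ds}\,e_1 + (\cos\phi)\,\omega^2{}_1(T)\,e_2 + (\cos\phi)\frac{d\phi}{ds}\,e_2 + (\sin\phi)\,\omega^1{}_2(T)\,e_1, \\
        &= -\left(\frac{d\phi}{ds} - \omega^1{}_2(T)\right)\sin\phi\;e_1 + \left(\frac{d\phi}{ds} - \omega^1{}_2(T)\right)\cos\phi\;e_2, \\
        &= \left(\frac{d\phi}{ds} - \omega^1{}_2(T)\right)\left[-(\sin\phi)\,e_1 + (\cos\phi)\,e_2\right], \\
        &= \left(\frac{d\phi}{ds} - \omega^1{}_2(T)\right)G, \\
        &= \kappa_g\,G.
\end{aligned}
$$

Comparing the last two equations, we get the desired result.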

This theorem is related to the notion discussed in figure 6.2 to the effect that
in a space with curvature, the parallel transport of a tangent vector around a
closed curve does not necessarily result in the same vector with which one
started. The difference in angle $\Delta\phi$ between a vector and the parallel transport
of the vector around a closed curve C is called the holonomy of the curve. The
holonomy of the curve is given by the integral

$$\Delta\phi = \oint_C \omega^1{}_2(T)\,ds. \tag{6.113}$$


6.8.2 Definition Let C be a smooth closed curve on M parametrized by
arc length with geodesic curvature $\kappa_g$. The line integral $\oint_C \kappa_g\,ds$ is called the
total geodesic curvature. If the curve is piecewise smooth, the total geodesic
curvature is the sum of the integrals of each piece.

A circle of radius R in the plane gives an elementary example. The geodesic curvature
is the constant $1/R$, so the total geodesic curvature is $(1/R)\,2\pi R = 2\pi$.

If we integrate formula 6.110 around a smooth simple closed curve C which
is the boundary of a region R and use Stokes’ Theorem, we get

$$
\begin{aligned}
\oint_C \kappa_g\,ds &= \oint_C d\phi - \oint_C \omega^1{}_2(T)\,ds, \\
                     &= \oint_C d\phi - \iint_R d\omega^1{}_2.
\end{aligned}
$$

For a smooth simple closed curve, $\oint_C d\phi = 2\pi$. Using the Cartan-form version
of the theorema egregium 4.106 we get immediately

$$\iint_R K\,dS + \oint_C \kappa_g\,ds = 2\pi. \tag{6.114}$$
R c 


Fig. 6.7: Turning Angles


If the boundary of the region consists of k smooth arcs as
illustrated in figure 6.7, the change of the angle $\phi$ along C is still $2\pi$, but the
total change needs to be modified by adding the exterior angles $\alpha_k$. Thus, we
obtain a fundamental result called the Gauss-Bonnet formula,

6.8.3 Theorem

$$\iint_R K\,dS + \oint_C \kappa_g\,ds + \sum_k \alpha_k = 2\pi. \tag{6.115}$$

Every interior angle $\iota_k$ is the supplement of the corresponding exterior angle $\alpha_k$,
so the Gauss-Bonnet formula can also be written as

$$\iint_R K\,dS + \oint_C \kappa_g\,ds + \sum_k (\pi - \iota_k) = 2\pi. \tag{6.116}$$


The simplest manifestation of the Gauss-Bonnet formula is for a triangle
in the plane. Planes are flat surfaces, so $K = 0$ and the straight edges are
geodesics, so $\kappa_g = 0$ on each of the three edges. The interior angle version of
the formula then just reads $3\pi - \iota_1 - \iota_2 - \iota_3 = 2\pi$, which just says that the
interior angles of a flat triangle add up to $\pi$. Since a sphere has constant positive
curvature, the sum of the interior angles of a spherical triangle is larger than $\pi$.
The amount by which this sum exceeds $\pi$ is called the spherical excess. For example,
the sum of the interior angles of a spherical triangle that is the boundary of one
octant of a sphere is $3\pi/2$, so the spherical excess is $\pi/2$.


6.8.4 Definition The quantity $\iint K\,dS$ is called the total curvature.

6.8.5 Example A sphere of radius R has constant Gaussian curvature $1/R^2$.
The surface area of the sphere is $4\pi R^2$, so the total Gaussian curvature for the
sphere is $4\pi$.

6.8.6 Example For a torus generated by a circle of radius a rotating about
an axis with radius b as in example (4.40), the differential of surface is
$dS = a(b + a\cos\theta)\,d\theta\,d\phi$, and the Gaussian curvature is $K = \cos\theta/[a(b + a\cos\theta)]$, so
the total Gaussian curvature is

$$\int_0^{2\pi}\!\!\int_0^{2\pi} \cos\theta\,d\theta\,d\phi = 0.$$


We now relate the Gauss-Bonnet formula to a topological entity. 


Fig. 6.8: Triangulation (right panel: Torus)


6.8.7 Definition Let M be a 2-dimensional manifold. A triangulation of the
surface is a subdivision of the surface into triangular regions $\{\Delta_k\}$ which are the
images of regular triangles under a coordinate patch, such that:

1) $M = \bigcup_k \Delta_k$.

2) $\Delta_i \cap \Delta_j$ is either empty, or a single vertex or an entire edge.

3) All the triangles are oriented in the same direction.

For an intuitive visualization of the triangulation of a sphere, think of inflating a
tetrahedron or an octahedron into a spherical balloon. We state without proof:


6.8.8 Theorem Any compact surface can be triangulated. 


6.8.9 Theorem Given a triangulation of a compact surface M, let V be the
number of vertices, E the number of edges and F the number of faces. Then
the quantity

$$\chi(M) = V - E + F, \tag{6.117}$$


is independent of the triangulation. In fact the quantity is independent of any 
“polyhedral” subdivision. This quantity is a topological invariant called the 
Euler characteristic. 


6.8.10 Example 


1. A balloon-inflated tetrahedron has V = 4, E = 6, F = 4, so the Euler 
characteristic of a sphere is 2. 


2. A balloon-inflated octahedron has V = 6, E = 12, F = 8, so we get the 
same number 2. 


3. The diagram on the right of figure 6.8 represents a topological torus. In
the given rectangle, opposite sides are identified in the same direction.
The edges, counted without double counting, are shown in red, and the
vertices not double counted are shown as black dots. We have
V = 6, E = 18, F = 12. So the Euler characteristic of a torus is 0.


4. If one has a compact surface, one can add a “handle”, that is, a torus, by
the following procedure. We excise a triangle in each of the two surfaces
and glue the edges. We lose two faces and the number of edges and vertices
cancel out, so the Euler characteristic of the new surface decreases by 2.
The Euler characteristic of a pretzel is −4.


5. The Euler characteristic of an orientable surface of genus g, that is, a 
surface with g holes is given by x(M) = 2 — 2g. 


6.8.11 Theorem Gauss-Bonnet
Let M be a compact, orientable surface. Then

$$\frac{1}{2\pi}\iint_M K\,dS = \chi(M). \tag{6.118}$$
2T M 


Proof Triangulate the surface so that $M = \bigcup_{k=1}^{F} \Delta_k$. We start with the
Gauss-Bonnet formula

$$\iint_M K\,dS = \sum_{k=1}^{F}\iint_{\Delta_k} K\,dS = \sum_{k=1}^{F}\left[-\oint_{\partial\Delta_k} \kappa_g\,ds - \pi + (\iota_{k1} + \iota_{k2} + \iota_{k3})\right],$$

where F is the number of triangles and the $\iota_k$’s are the interior angles of triangle
$\Delta_k$. The line integrals of the geodesic curvatures all cancel out since each edge
in every triangle is traversed twice, each in opposite directions. Rewriting the
equation, we get

$$\iint_M K\,dS = -\pi F + S,$$

where S is the sum of all interior angles. Since the manifold is locally Euclidean,
the sum of all interior angles at a vertex is $2\pi$, so we have

$$\iint_M K\,dS = -\pi F + 2\pi V.$$

There are F faces. Each face has three edges, but each edge is counted twice,
so $3F = 2E$, and we have $F = 2E - 2F$. Substituting in the equation above, we
get,

$$\iint_M K\,dS = -\pi(2E - 2F) + 2\pi V = 2\pi(V - E + F) = 2\pi\,\chi(M).$$


This is a remarkable theorem because it relates the bending invariant Gaus- 
sian curvature to a topological invariant. Theorems such as this one which cut 
across disciplines, are the most significant in mathematics. Not surprisingly, it 
was Chern who proved a generalization of the Gauss-Bonnet theorem to general 
orientable Riemannian manifolds of even dimensions [5]. 
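
The theorem lends itself to a quick numerical check on the two examples treated in this section. The sketch below is an illustration only: it approximates $\frac{1}{2\pi}\iint K\,dS$ by the midpoint rule, using $K = 1/R^2$ with $dS = R^2\sin\theta\,d\theta\,d\phi$ for the sphere and the torus expressions from example 6.8.6, and compares the result with $V - E + F$ from the triangulation counts in example 6.8.10. The function names and sample radii are assumptions made for the example.

```python
import math

def euler_characteristic(V, E, F):
    """chi = V - E + F for a triangulation with the given counts."""
    return V - E + F

def sphere_total_curvature(R, n=200):
    """Midpoint-rule value of the integral of K dS over a sphere of radius R."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        K = 1.0 / (R * R)
        area_density = R * R * math.sin(theta)   # dS = R^2 sin(theta) dtheta dphi
        total += K * area_density * h
    return total * 2.0 * math.pi                 # integrand does not depend on phi

def torus_total_curvature(a, b, n=200):
    """Midpoint-rule value of the integral of K dS over the torus of 6.8.6."""
    h = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        area_density = a * (b + a * math.cos(theta))   # dS / (dtheta dphi)
        K = math.cos(theta) / area_density
        total += K * area_density * h
    return total * 2.0 * math.pi                 # integrand does not depend on phi

chi_sphere = sphere_total_curvature(2.0) / (2.0 * math.pi)
chi_torus = torus_total_curvature(1.0, 3.0) / (2.0 * math.pi)
```

Up to quadrature error, `chi_sphere` matches `euler_characteristic(4, 6, 4)` for the inflated tetrahedron and `chi_torus` matches `euler_characteristic(6, 18, 12)` for the torus diagram, independently of the radii chosen.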


Chapter 7 


Groups of Transformations 


7.1 Lie Groups 


At the IX International Colloquium on Group Theoretical Methods in Physics
held in 1980 at Cocoyoc, Mexico, one of the invited addresses was delivered
by the famous mathematician Bertram Kostant, who years later would be
awarded the Wigner Medal. In his opening remarks, Kostant made the
following intriguing statement, “In the 1800’s, Felix Klein and Sophus Lie decided
to divide mathematics among themselves. Klein took the discrete and Lie took
the continuous. I am here to tell you that they were both working on the same
thing.”


In this chapter we present an elementary introduction to Lie groups and
corresponding Lie algebras. A simple example of a Lie group is the circle group
$U(1) = \{z \in \mathbb{C} : |z|^2 = 1\}$ which, as a manifold, corresponds to the unit circle
$S^1$. In polar form, an element of this group can be written in the form $e^{i\theta}$.
The group multiplication $z \mapsto e^{i\theta} z$ corresponds to a rotation of the vector $z$
by an angle $\theta$. Using the matrix representation 5.2.2 for complex numbers and
Euler’s formula $e^{i\theta} = \cos\theta + i\sin\theta$, we see that the group is isomorphic to
$SO(2,\mathbb{R})$ as shown in example 3.4. This example captures the essence of what
we seek, that is, a group that is also a manifold and that can be associated
with some matrix group. A $U(1)$ bundle consists of a base manifold $M$ with a
structure that locally looks like a Cartesian product of an open set in $M$ with $U(1)$.
Generalizations in which the fibers are Lie groups lead to a structure called
a principal fiber bundle. Lie groups, Lie algebras and principal fiber bundles
provide the mathematical foundation for modelling symmetries in classical and
quantum physics. The subject is much too rich to give a comprehensive treatment
here, but we hope the material will serve as a starting block for further
study.
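The identification of $U(1)$ with $SO(2,\mathbb{R})$ can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the helper names `rotation_matrix`, `apply` and `matmul`, and the sample angles and point, are hypothetical choices. It compares multiplication by $e^{i\theta}$ with the action of the corresponding rotation matrix, and checks that the map $e^{i\theta} \mapsto R(\theta)$ respects the group product.

```python
import cmath
import math

def rotation_matrix(theta):
    """2x2 real matrix representing multiplication by e^{i theta}."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def apply(matrix, vec):
    """Multiply a 2x2 matrix by a 2-vector."""
    return [matrix[0][0] * vec[0] + matrix[0][1] * vec[1],
            matrix[1][0] * vec[0] + matrix[1][1] * vec[1]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

theta, phi = 0.7, 1.9
z = complex(2.0, -1.0)

# Group action: complex multiplication versus matrix rotation of (Re z, Im z).
w = cmath.exp(1j * theta) * z
x, y = apply(rotation_matrix(theta), [z.real, z.imag])
action_error = abs(complex(x, y) - w)

# Homomorphism: e^{i theta} e^{i phi} = e^{i(theta + phi)} corresponds to
# R(theta) R(phi) = R(theta + phi).
product = matmul(rotation_matrix(theta), rotation_matrix(phi))
target = rotation_matrix(theta + phi)
hom_error = max(abs(product[i][j] - target[i][j]) for i in range(2) for j in range(2))

print(action_error, hom_error)
```

Both errors are at the level of floating-point round-off, which is what the isomorphism predicts.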


7.1.1 Definition A Lie group $G$ is a group that is also a smooth manifold.
It is assumed that the usual group multiplication and inverse operations,


$$\mu : G \times G \to G, \qquad (g_1, g_2) \mapsto g_1 g_2,$$
$$\iota : G \to G, \qquad g \mapsto g^{-1},$$


are $C^\infty$.


7.1.2 Definition A Lie subgroup is a subset $H \subset G$ of a Lie group $G$ that
is itself a Lie group.


7.1.3 Examples 


1. The (real) general linear group is the set of $n \times n$ matrices,


$$GL(n,\mathbb{R}) = \{A \in M_{n\times n}(\mathbb{R}) : \det(A) \neq 0\}. \tag{7.1}$$


Topologically, $GL(n,\mathbb{R})$ is a subset of $\mathbb{R}^{n^2}$ and has the structure of an
$n^2$-dimensional differentiable manifold. The map $\det : GL(n,\mathbb{R}) \to \mathbb{R}$
is continuous. The inverse image of 0 under this map is a closed set
and $GL(n,\mathbb{R})$ is its complement, so $GL(n,\mathbb{R})$ is an open subset of $\mathbb{R}^{n^2}$
and thus it is not compact. $GL(n,\mathbb{R})$ is not connected, being the union
of two disjoint open sets defined by whether $\det(A) > 0$ or $\det(A) < 0$.
The connected component $GL^+(n,\mathbb{R})$ corresponding to $\det(A) > 0$
contains the identity. Furthermore, if $\det(A) > 0$ and $\det(B) > 0$, we have
$\det(AB) > 0$ and $\det(A^{-1}) > 0$, so $GL^+(n,\mathbb{R})$ is a (non-compact) Lie
subgroup.


2. The subset of $GL^+(n,\mathbb{R})$ with the restriction $\det(A) = 1$ is called the
special linear group $SL(n,\mathbb{R})$. This is also a non-compact subgroup.


3. The complex general linear group is the set of $n \times n$ matrices,


$$GL(n,\mathbb{C}) = \{A \in M_{n\times n}(\mathbb{C}) : \det(A) \neq 0\}. \tag{7.2}$$


The subgroup of matrices $A \in GL(n,\mathbb{C})$ with $\det(A) = 1$ is called the
special linear group $SL(n,\mathbb{C})$.


4. The real orthogonal group is the set of $n \times n$ real matrices


$$O(n,\mathbb{R}) = \{A \in M_{n\times n}(\mathbb{R}) : A^{-1} = A^T\}. \tag{7.3}$$

The condition $A^{-1} = A^T$ is equivalent to $AA^T = A^TA = I$. We have the
following

a) $I^T = I^{-1} = I$, so $I \in O(n,\mathbb{R})$.

b) If $A, B \in O(n,\mathbb{R})$, then $(AB)(AB)^T = ABB^TA^T = AA^T = I$, so
$AB \in O(n,\mathbb{R})$.

c) If $A \in O(n,\mathbb{R})$, then $A^{-1}(A^{-1})^T = A^T(A^T)^T = A^TA = I$.

Hence $O(n,\mathbb{R})$ is a Lie subgroup of $GL(n,\mathbb{R})$. The map $T(A) = AA^T$
is continuous and $O(n,\mathbb{R}) = T^{-1}(I)$, so $O(n,\mathbb{R})$ is closed.

If we denote by $e_k$ the $k$th column vector of $A$, then the matrix element
of $A^TA$ in the $j$th row and $k$th column is given by


$$(e_j)^T e_k = \langle e_j, e_k\rangle = \delta_{jk}.$$


Thus, the columns (and rows) of an orthogonal matrix constitute a set of
orthonormal vectors. There are $n$ column vectors, so under the Euclidean
norm of $\mathbb{R}^{n^2}$, we have $\|A\|^2 = n$. That is, elements of $O(n,\mathbb{R})$ lie on
a sphere $S^{n^2-1}$, so the set is bounded. By the Heine-Borel theorem,
$O(n,\mathbb{R})$ is compact. By equation 1.57 and the subsequent Theorem, we
can characterize the orthogonal group as the set of linear transformations
that preserve the standard metric $g = \mathrm{diag}(+1,+1,\dots,+1)$ in $\mathbb{R}^n$.


5. If a matrix $A$ is orthogonal, then the condition $AA^T = I$ implies that


$$\det(AA^T) = \det(A)\det(A^T) = \det(A)^2 = \det(I),$$


so $\det(A) = \pm 1$. We define the (real) special orthogonal group $SO(n,\mathbb{R})$
to be the subset of $O(n,\mathbb{R})$ of orthogonal matrices $A$ with $\det A = 1$.
$SO(n,\mathbb{R})$ is a compact Lie subgroup of dimension $\frac{1}{2}n(n-1)$.


6. The unitary group is the set of $n \times n$ matrices


$$U(n) = \{A \in M_{n\times n}(\mathbb{C}) : A^{-1} = A^\dagger\}, \tag{7.4}$$


where $A^\dagger$ is the Hermitian adjoint. This is the complex analog of the
orthogonal group. The condition $A^{-1} = A^\dagger$ is equivalent to $AA^\dagger = I$,
which implies that $|\det(A)| = 1$. The subgroup of unitary matrices $A$
with $\det(A) = 1$ is called the special unitary group $SU(n)$; it is a compact
group of dimension $n^2 - 1$.


7. Let $\{M, g\}$ be the pseudo-Riemannian manifold $\mathbb{R}^n$ with a type $(p,q)$
metric with signature $g = \mathrm{diag}(1,1,\dots,-1,-1,\dots)$. The group of
transformations preserving this metric is called $O(p,q)$. If in addition, the
matrices $A \in O(p,q)$ are required to have $\det(A) = 1$, the group is called
$SO(p,q)$. These groups are not compact. The special case $L = O(1,3)$
is the group of transformations preserving the Minkowski metric. This
group is called the Lorentz group, which is central to relativistic physics.


8. In a completely analogous manner, let $\{M, g\}$ be the pseudo-Riemannian
complex manifold $\mathbb{C}^n$ with a hermitian metric $g = \mathrm{diag}(1,1,\dots,-1,-1)$
of type $(p,q)$. The group of transformations preserving this metric is
called $U(p,q)$. If in addition, the matrices $A \in U(p,q)$ are required to
have $\det(A) = 1$, the group is called $SU(p,q)$. These groups are also not
compact. The special case $SU(2,2)$ is locally isomorphic to the conformal
group of Minkowski space, which contains the Poincaré group, and is of
interest in twistor theory.




9. Consider the space $F^{2n}$, where $F$ stands for the reals $\mathbb{R}$, the complex $\mathbb{C}$,
or the quaternion $\mathbb{H}$ algebras. Let $(q^1,\dots,q^n,p_1,\dots,p_n)$ denote the local
coordinates. In the case $F = \mathbb{R}$, we may think of the local coordinates as
representing position and momenta. Let $\Omega$ be the non-degenerate
skew-symmetric two-form


$$\Omega = dq^i \wedge dp_i, \qquad i = 1,\dots,n. \tag{7.5}$$


The symplectic group $Sp(2n,F)$ is the group of transformations preserving
the symplectic form $\Omega$. The tensor components of $\Omega$ in the standard basis are
given by

$$\Omega = \begin{pmatrix} 0 & I_n \\ -I_n & 0 \end{pmatrix}, \tag{7.6}$$


where $I_n$ is the identity $n \times n$ matrix. Then the symplectic group is defined
by

$$Sp(2n,F) = \{A \in M_{2n\times 2n}(F) : A^T\Omega A = \Omega\}. \tag{7.7}$$


The symplectic group is an essential structure in the differential geometry
description of Lagrangian and Hamiltonian mechanics. In the simplest
case in which $F = \mathbb{R}$ and $n = 1$, the components of the canonical
symplectic form are those of the complex structure introduced in equation 5.50 in the
context of conformal maps. It is immediately clear that $Sp(2,\mathbb{R})$ consists
of all $2 \times 2$ matrices $A$ with $\det(A) = 1$, so $Sp(2,\mathbb{R}) = SL(2,\mathbb{R})$. The
symplectic groups are connected but not compact. We define the
compact group,

$$Sp(n) = Sp(2n,\mathbb{C}) \cap SU(2n),$$


that is, the space of all complex symplectic matrices which are also
elements of the special unitary group. $Sp(n)$ can be identified with the
quaternionic unitary group $U(n,\mathbb{H})$. In particular, $Sp(1)$ is the set of
unit quaternions and $Sp(1) \cong SU(2)$ is topologically a three sphere $S^3$.
More details on this topic appear later in the discussion of quaternions,
starting with equation 8.15.
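Some of the defining conditions in the examples above can be tested directly on $2 \times 2$ matrices. The sketch below is an illustration only: the helper names and the particular matrices `rot`, `shear` and `scale` are assumptions made for the example, and the matrix `omega` uses the sign convention $\Omega = \begin{pmatrix} 0 & 1 \\ -1 & 0\end{pmatrix}$ for $n = 1$. It checks the orthogonality condition, the norm $\|A\|^2 = n$, and the statement that a $2 \times 2$ matrix preserves $\Omega$ exactly when its determinant is 1.

```python
import math

def transpose(a):
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def det2(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def max_diff(a, b):
    """Largest entrywise difference between two matrices of the same shape."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

identity = [[1.0, 0.0], [0.0, 1.0]]
omega = [[0.0, 1.0], [-1.0, 0.0]]   # canonical symplectic form for n = 1

# A rotation lies in SO(2): A A^T = I, det A = 1, and ||A||^2 = n = 2.
t = 0.4
rot = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]
orth_error = max_diff(matmul(rot, transpose(rot)), identity)
frob_sq = sum(x * x for row in rot for x in row)

# A matrix with det = 1 that is not orthogonal still preserves omega.
shear = [[2.0, 3.0], [1.0, 2.0]]     # det = 4 - 3 = 1
sympl_error = max_diff(matmul(matmul(transpose(shear), omega), shear), omega)

# For 2x2 matrices A^T omega A = det(A) omega, so det != 1 fails the condition.
scale = [[2.0, 0.0], [0.0, 1.0]]     # det = 2
scaled = matmul(matmul(transpose(scale), omega), scale)

print(orth_error, frob_sq, det2(rot), sympl_error, scaled)
```

The last line illustrates why $Sp(2,\mathbb{R}) = SL(2,\mathbb{R})$: the symplectic condition for a $2 \times 2$ matrix reduces to a condition on the determinant alone.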


7.1.1 One-Parameter Groups of Transformations 


In this section we formalize the notion of flows of vector fields mentioned in
definition 1.1.13. The concept of flows of vector fields permeates all of physics.
The classical description of magnetic fields illustrates this well. Consider for
example the Earth’s magnetic field. At any point around the planet, one
associates a vector giving the magnitude and direction of the magnetic field at
that point. If one picks any such point and follows an infinitesimal trajectory
along the Earth’s magnetic field vector, one arrives at a new point with a
corresponding field vector. Iterating the process, one obtains an integral curve
on which the vector field restricted to that curve is tangential to the curve.
Doing this at all points in a neighborhood of the point then gives rise to a
family of non-intersecting integral curves that we usually call the magnetic
field lines. Magnetic field lines traced by iron filings around a laboratory-grade
magnetic sphere give a geometrical rendition of the direction of the magnetic
field at any point. The converse notion of flows is also intuitively clear as
shown in figure 7.1. If one has a non-intersecting family of curves on a
neighborhood of a point on a manifold, one would expect that the tangent
vectors to the family of curves would constitute a vector field. Students
acquainted with more advanced classical physics will know that the state space
of a dynamical system is equipped with a real-valued function $H$ called the
Hamiltonian, which effectively represents the energy at each point. Hamiltonian
mechanics is then formulated in terms of a symplectic structure that associates
with $H$ a Hamiltonian vector field $X_H$ whose integral curves correspond to the
solutions of the equations of motion. For a rigorous and elegant treatment of
this subject, see Abraham-Marsden [20].


7.1.4 One-Parameter group of diffeomorphisms

Let $U \subset M$ be an open subset of an $n$-dimensional manifold, $p \in U$, and let
$I_\epsilon = (-\epsilon, \epsilon)$ with $\epsilon > 0$ be an open interval in $\mathbb{R}$.
A one-parameter group of diffeomorphisms is a smooth map,


$$\varphi : I_\epsilon \times U \to M, \qquad (t, p) \mapsto \varphi_t(p), \quad |t| < \epsilon,$$


with the following properties. Suppose that $t, s \in \mathbb{R}$, with $|t|, |s| < \epsilon$,
$|s + t| < \epsilon$, and $\varphi_s(p), \varphi_t(p), \varphi_{s+t}(p) \in U$, then

a) $\varphi_s \circ \varphi_t = \varphi_{s+t}$,

b) $\varphi_0(p) = p$ for all $p \in U$.

Fig. 7.1: Integral Curve

The map $\varphi_t$ is clearly a local diffeomorphism with inverse function given by
$(\varphi_t)^{-1} = \varphi_{-t}$. Now consider a vector field $X$ such that $X_p = \varphi_t'(p)$ at each
point $p = \varphi_t(p)$, as shown in figure 7.1. If $f : M \to \mathbb{R}$ is a smooth function,
then the action of $X$ on $f$ as a linear derivation is given by the push-forward
formula 1.25


$$X_p(f) = \varphi_t'(p)(f) = \frac{d}{dt}\,(f \circ \varphi_t(p))\Big|_{t=0}.$$


Conversely, if $X$ is a vector field given in local coordinates $x^\mu$ in a neighborhood
of a point $p$ by,


$$X = v^\mu \frac{\partial}{\partial x^\mu}, \qquad \mu = 1,\dots,n,$$

then, for a curve $\varphi_t(p)$ with initial condition $\varphi_0(p) = p$ to have $X_p$ as a
tangent vector, it must be the case that,

$$\varphi_t' = \frac{dx^\mu}{dt}\frac{\partial}{\partial x^\mu}.$$


Thus, we are led to a system of first order ordinary differential equations,


$$\frac{dx^\mu}{dt} = v^\mu(x^1,\dots,x^n),$$


subject to the condition $x^\mu \circ \varphi_0(p) = x^\mu(p)$. By the existence and uniqueness
theorem of solutions of such systems, for sufficiently small $\epsilon$, there exists a
unique integral curve that satisfies the equation on $I_\epsilon = (-\epsilon, \epsilon)$. We conclude
that smooth vector fields can be viewed as generators of infinitesimal groups of
diffeomorphisms. The one-parameter group of diffeomorphisms $\varphi_t$ is also called
the flow of the vector field $X$.
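The passage from a vector field to its flow can be sketched numerically. The example below rests on assumptions made only for illustration: the vector field $X = -y\,\partial_x + x\,\partial_y$ on $\mathbb{R}^2$, the classical fourth-order Runge-Kutta integrator, and the helper names `field`, `flow` and `dist`. It integrates $dx^\mu/dt = v^\mu$ and checks the group property $\varphi_s \circ \varphi_t = \varphi_{s+t}$, the inverse $(\varphi_t)^{-1} = \varphi_{-t}$, and agreement with the exact flow, which for this field is a rotation by the angle $t$.

```python
import math

def field(point):
    """Vector field X = -y d/dx + x d/dy, the generator of rotations (an illustrative choice)."""
    x, y = point
    return (-y, x)

def flow(point, t, steps=200):
    """Approximate phi_t(point) by integrating dx/dt = v(x) with classical Runge-Kutta."""
    h = t / steps
    x, y = point
    for _ in range(steps):
        k1 = field((x, y))
        k2 = field((x + 0.5 * h * k1[0], y + 0.5 * h * k1[1]))
        k3 = field((x + 0.5 * h * k2[0], y + 0.5 * h * k2[1]))
        k4 = field((x + h * k3[0], y + h * k3[1]))
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return (x, y)

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

p = (1.0, 0.5)
s, t = 0.3, 0.8

group_error = dist(flow(flow(p, t), s), flow(p, s + t))   # phi_s o phi_t = phi_{s+t}
inverse_error = dist(flow(flow(p, t), -t), p)             # (phi_t)^{-1} = phi_{-t}
exact = (p[0] * math.cos(t) - p[1] * math.sin(t),
         p[0] * math.sin(t) + p[1] * math.cos(t))
exact_error = dist(flow(p, t), exact)                     # integral curves are circles

print(group_error, inverse_error, exact_error)
```

The three errors are small but not zero, since the integrator only approximates the flow; the group property holds exactly for the true solution of the system.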

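At this point it is worthwhile to review the notion of the push-forward of
a vector field first introduced in 1.11. Let $\varphi : M \to N$ be a smooth manifold
mapping and $X \in \mathfrak{X}(M)$. If $g : N \to \mathbb{R}$ is a smooth function, the push-forward
of $X$ is a vector field $Y = \varphi_* X \in \mathfrak{X}(N)$ defined by


$$(\varphi_* X)(g) = X(g \circ \varphi).$$


More precisely, if $p \in M$,

$$Y_{\varphi(p)}(g) = (\varphi_* X)_{\varphi(p)}(g) = X_p(g \circ \varphi),$$

so that,

$$Y(g) \circ \varphi = X(g \circ \varphi). \tag{7.8}$$


Suppose that in addition, $\varphi$ is a diffeomorphism, and let $\xi_t$ be a one-parameter
group associated with a vector field $X \in \mathfrak{X}(M)$. Then we can push-forward to
a one-parameter subgroup $\eta_t$ associated with $\varphi_* X$ in $N$ given by


$$\eta_t = \varphi \circ \xi_t \circ \varphi^{-1}, \tag{7.9}$$

as illustrated in the commuting diagram,

$$\begin{array}{ccc}
M & \xrightarrow{\ \varphi\ } & N \\
\xi_t \downarrow & & \downarrow \eta_t \\
M & \xrightarrow{\ \varphi\ } & N.
\end{array}$$

In other words, under a diffeomorphism $\varphi$, the integral curves of a vector field
$X$ are mapped to the integral curves of $\varphi_* X$.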

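Diffeomorphisms also allow us to use the inverse of the push-forward to
pullback vectors, and push-forward functions with the inverse of the pullback.
Specifically, if $f \in \mathscr{F}(M)$ and $Y \in \mathfrak{X}(N)$,

$$\varphi^* Y = (\varphi^{-1})_* Y = \varphi^{-1}_*(Y \circ \varphi), \tag{7.10}$$
$$\varphi_* f = f \circ \varphi^{-1}, \tag{7.11}$$


which we can write,


$$\varphi^* Y(f) = Y(\varphi_* f). \tag{7.12}$$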




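When $\varphi$ is a diffeomorphism, one may extend the pullback to tensor fields.
Suppose $M$ is an $n$-dimensional manifold. Since $M$ locally looks like $\mathbb{R}^n$, on
a coordinate patch on a neighborhood of a point $p$ we can pick an orthonormal
basis $\{e_1,\dots,e_n\}$ for the tangent space, with dual basis $\{\theta^1,\dots,\theta^n\}$. Referring
back to section 2.2.1, a tensor field $T \in \mathscr{T}^r_s(M)$ is a section of the bundle of
tensor products of the tangent and cotangent spaces. In the given basis the
tensor is an expression of the form,


$$T = T^{i_1\dots i_r}_{j_1\dots j_s}\; e_{i_1} \otimes \dots \otimes e_{i_r} \otimes \theta^{j_1} \otimes \dots \otimes \theta^{j_s}. \tag{7.13}$$


The tensor components are defined as,


$$T^{i_1\dots i_r}_{j_1\dots j_s} = T(\theta^{i_1},\dots,\theta^{i_r}, e_{j_1},\dots,e_{j_s}). \tag{7.14}$$


Let $t$ be a tensor field on $N$. The generalization of the pull-back 2.68 is given
by,


$$(\varphi^* t)_p(\theta^{i_1},\dots,\theta^{i_r}, e_{j_1},\dots,e_{j_s}) = t_{\varphi(p)}(\varphi_*\theta^{i_1},\dots,\varphi_*\theta^{i_r}, \varphi_* e_{j_1},\dots,\varphi_* e_{j_s}). \tag{7.15}$$

Once again, the fancy equation can be demystified as just a generalized version
of the chain rule. If the diffeomorphism is given in local coordinates by
$y^\mu = f^\mu(x^\nu)$, so that $e_\mu = \partial/\partial x^\mu$ and $\theta^\mu = dx^\mu$, equation 7.15 is the classical
transformation law for tensors,


$$(\varphi^* t)^{i_1\dots i_r}_{j_1\dots j_s} = \frac{\partial x^{i_1}}{\partial y^{k_1}} \cdots \frac{\partial x^{i_r}}{\partial y^{k_r}}\, \frac{\partial y^{l_1}}{\partial x^{j_1}} \cdots \frac{\partial y^{l_s}}{\partial x^{j_s}}\; t^{k_1\dots k_r}_{l_1\dots l_s}. \tag{7.16}$$


The matrices $\partial x^\mu/\partial y^\nu$ are allowed because $\varphi$ is a diffeomorphism, so the
Jacobians are invertible. Of course, we can pull-back (or push-forward) vectors and
tensors if $\varphi$ is a local diffeomorphism of $M$ into itself.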

Perhaps this is an appropriate time to generalize the coordinate-free exterior
derivative formula 6.28 to arbitrary forms. Let $\Lambda^k(M)$ be the bundle of
alternating covariant tensors of rank $k$ and denote the sections of the bundle
by $\Omega^k(M)$. As in 2.3, sections of this bundle are called $k$-forms on $M$.


7.1.5 Definition Let $\omega$ be a $k$-form on $M$, and let $X_1,\dots,X_{k+1} \in \mathfrak{X}(M)$.
The exterior derivative of $\omega$ is the $(k+1)$-form given by (see, for example,
Spivak [34])


$$d\omega(X_1,\dots,X_{k+1}) = \sum_{i=1}^{k+1} (-1)^{i+1}\, X_i\big(\omega(X_1,\dots,\hat{X}_i,\dots,X_{k+1})\big)
+ \sum_{i<j} (-1)^{i+j}\, \omega([X_i,X_j], X_1,\dots,\hat{X}_i,\dots,\hat{X}_j,\dots,X_{k+1}), \tag{7.17}$$


where the “hats” mean that these vectors are excluded. Thus, for a 2-form $\omega$,
the formula gives


$$d\omega(X,Y,Z) = X(\omega(Y,Z)) - Y(\omega(X,Z)) + Z(\omega(X,Y))
- \omega([X,Y],Z) + \omega([X,Z],Y) - \omega([Y,Z],X).$$



If one chooses local coordinates as in section 2.3, we find that the operator is 
consistent with 2.65, and satisfies the properties in 2.66 and 2.69. In particular, 
the following holds: 


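7.1.6 Theorem Let $M$, $N$ be manifolds, $\alpha \in \Omega^k(M)$, $\beta \in \Omega^l(M)$, and
$\varphi : M \to N$ be a diffeomorphism as above. Then

a. $d \circ d = 0$,

b. $d(\alpha \wedge \beta) = d\alpha \wedge \beta + (-1)^k \alpha \wedge d\beta$,

c. $\varphi^*(\alpha \wedge \beta) = \varphi^*\alpha \wedge \varphi^*\beta$,

d. $\varphi^*(d\omega) = d(\varphi^*\omega)$.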


7.1.2 Lie Derivatives 


Let $\varphi_t$ be the one-parameter group of diffeomorphisms generated by the
integral curves of a vector field $X$. We can use the pullback and the push-forward
of $\varphi_t$ to define a rate of change of functions, forms and vector fields in
the direction of $X$. If $f \in \mathscr{F}(M)$ is a smooth function on $M$, and $p \in M$ is a
point on $M$, we define the Lie derivative of $f$ with respect to $X$ by


$$\pounds_{X_p} f = \frac{d}{dt}(\varphi_t^* f)\Big|_{t=0}
= \lim_{t\to 0} \frac{f(\varphi_t(p)) - f(p)}{t}
= X_p(f) \tag{7.18}$$
$$\phantom{\pounds_{X_p} f} = df(X)|_p, \tag{7.19}$$
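The equality between the difference quotient along the flow and the directional derivative can be checked numerically. The following sketch makes several illustrative assumptions that are not in the text: the vector field $X = -y\,\partial_x + x\,\partial_y$, whose exact flow is rotation by the angle $t$, the function $f = x^2 y$, and a central difference in place of the one-sided limit.

```python
import math

def flow(point, t):
    """Exact flow of X = -y d/dx + x d/dy: rotation by the angle t."""
    x, y = point
    return (x * math.cos(t) - y * math.sin(t), x * math.sin(t) + y * math.cos(t))

def f(point):
    x, y = point
    return x * x * y

def X_of_f(point):
    """X(f) = -y df/dx + x df/dy, computed by hand for f = x^2 y."""
    x, y = point
    return -y * (2 * x * y) + x * (x * x)

def lie_derivative_of_function(func, point, t=1e-5):
    """Central difference quotient of t -> func(phi_t(point)) at t = 0."""
    return (func(flow(point, t)) - func(flow(point, -t))) / (2 * t)

p = (1.2, -0.7)
numeric = lie_derivative_of_function(f, p)
exact = X_of_f(p)
print(numeric, exact)
```

The two numbers agree up to the discretization error of the difference quotient, which is the content of equation 7.18 for this particular choice of $X$ and $f$.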


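which, as expected, is just the directional derivative. If $\omega \in \Omega^1(M)$ is a one
form on $M$, we define the Lie derivative of $\omega$ at $p$ by


$$(\pounds_X \omega)_p = \frac{d}{dt}(\varphi_t^* \omega)\Big|_{t=0}
= \lim_{t\to 0} \frac{\varphi_t^*(\omega_{\varphi_t(p)}) - \omega_p}{t}.$$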


This certainly has the right flavor of a derivative, namely, we pullback the form
from a nearby point, compute the difference quotient, and then measure the
infinitesimal change by evaluating the limit as $t$ goes to 0. We will compute a
formula for $\pounds_X\omega$ a little later in this section. In a similar manner, if $Y \in \mathfrak{X}(M)$
is another vector field in $M$, we define its Lie derivative along $X$ by using the
push-forward

$$(\pounds_X Y)_p = \lim_{t\to 0} \frac{Y_p - (\varphi_{t*} Y)_p}{t},$$

where $(\varphi_{t*} Y)_p = \varphi_{t*}(Y_{\varphi_t^{-1}(p)})$. That is, take the vector $Y$ at $\varphi_t^{-1}(p) = \varphi_{-t}(p)$,
push it forward to $p$ and compare the infinitesimal change with $Y_p$, as shown in
figure 7.2. Since $\varphi_t$ is a diffeomorphism, we can pullback vectors by the inverse
of the push-forward, so we could equivalently define $\pounds_X Y$ in a manner that looks
more like the definition for functions and forms

Fig. 7.2: Lie Derivative


$$\pounds_{X_p} Y = \frac{d}{dt}(\varphi_t^* Y)\Big|_{t=0} \tag{7.20}$$
$$\phantom{\pounds_{X_p} Y} = \frac{d}{dt}\big((\varphi_t^{-1})_* Y\big)\Big|_{t=0} \tag{7.21}$$
$$\phantom{\pounds_{X_p} Y} = \lim_{t\to 0} \frac{(\varphi_t^{-1})_* Y_{\varphi_t(p)} - Y_p}{t}. \tag{7.22}$$


In fact, since formula 7.15 shows that one can pull-back tensors in the case of
a diffeomorphism, the Lie derivative can be extended to a linear derivation on
the full tensor algebra.


7.1.7 Definition Let $X$ be a vector field in $M$, let $\varphi_t(p)$ be the
one-parameter family of diffeomorphisms generated by $X$, and let $T \in \mathscr{T}^r_s$ be a tensor
field. The Lie derivative of the tensor $T$ with respect to $X$ at $p$ is defined as,


$$\pounds_{X_p} T = \frac{d}{dt}(\varphi_t^* T)\Big|_{t=0},$$

$$\frac{d}{dt}\varphi_t^* T = \varphi_t^* \pounds_X T. \tag{7.23}$$


That the second version of the definition follows from the first can easily be
established by a quick computation


$$\frac{d}{dt}\varphi_t^* T = \frac{d}{ds}(\varphi_{t+s}^* T)\Big|_{s=0}
= \frac{d}{ds}(\varphi_t^* \varphi_s^* T)\Big|_{s=0} = \varphi_t^* \pounds_X T.$$


The operator $\pounds_X : \mathscr{T}^r_s \to \mathscr{T}^r_s$ is clearly linear and satisfies the Leibniz rule. We
have the following important theorem,


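7.1.8 Theorem $\pounds_X Y = [X,Y]$.
Proof Let $f \in \mathscr{F}(M)$ and consider the function $f \circ \varphi_t$. The Taylor expansion
about the point $p$ gives


$$f \circ \varphi_t = (f \circ \varphi_t)\big|_{t=0} + t\,\frac{d}{dt}(f \circ \varphi_t)\Big|_{t=0} + O(t^2)
= f(p) + t\,X_p(f) + O(t^2).$$


From the definition of the Lie derivative, we have


$$\pounds_X Y = \lim_{t\to 0} \frac{Y - \varphi_{t*} Y}{t}.$$

Applying this to $f$ at $p$, we get,

$$\pounds_{X_p} Y(f) = \lim_{t\to 0} \frac{(Y - \varphi_{t*} Y)(f)}{t}(p),$$
$$= \lim_{t\to 0} \frac{Y_p(f) - Y_{\varphi_t^{-1}(p)}(f \circ \varphi_t)}{t},$$
$$= \lim_{t\to 0} \frac{Y(f)(p) - Y_{\varphi_t^{-1}(p)}(f + t\,X(f))}{t},$$
$$= \lim_{t\to 0} \left[ \frac{Y_p(f) - Y_{\varphi_t^{-1}(p)}(f)}{t} - Y_{\varphi_t^{-1}(p)}(X(f)) \right],$$
$$= X_p(Y(f)) - Y_p(X(f)),$$
$$= [X,Y]_p(f).$$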
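The theorem can be tested numerically on a simple pair of vector fields. The sketch below is an illustration under assumptions that are not part of the text: $X = -y\,\partial_x + x\,\partial_y$, whose flow is a rotation (so the push-forward by $\varphi_{-t}$ is the rotation by $-t$ acting on components), and $Y = x^2\,\partial_x + xy\,\partial_y$. The bracket is computed by hand from $[X,Y]^i = X^j\partial_j Y^i - Y^j\partial_j X^i$ and compared with a central difference version of formula 7.22.

```python
import math

def rotate(vec, t):
    """Rotation by the angle t; this is both the flow of X and its push-forward."""
    x, y = vec
    return (x * math.cos(t) - y * math.sin(t), x * math.sin(t) + y * math.cos(t))

def Y(point):
    x, y = point
    return (x * x, x * y)

def bracket_XY(point):
    """[X, Y] for X = (-y, x), Y = (x^2, xy), computed by hand."""
    x, y = point
    return (-x * y, -y * y)

def pulled_back_Y(point, t):
    """(phi_{-t})_* applied to Y at phi_t(point)."""
    return rotate(Y(rotate(point, t)), -t)

def lie_derivative(point, t=1e-5):
    """Central difference approximation of d/dt of the pulled-back field at t = 0."""
    plus, minus = pulled_back_Y(point, t), pulled_back_Y(point, -t)
    return tuple((a - b) / (2 * t) for a, b in zip(plus, minus))

p = (0.9, 1.4)
numeric = lie_derivative(p)
exact = bracket_XY(p)
print(numeric, exact)
```

A linear flow was chosen so that the push-forward needs no numerical Jacobian; for a general field one would have to differentiate the flow map as well.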


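7.1.9 Theorem If $\varphi : M \to N$ is a diffeomorphism and $f \in \mathscr{F}(M)$ is any
smooth function on $M$, then,


$$\pounds_{\varphi_* X}(\varphi_* f) = \varphi_*(\pounds_X f). \tag{7.24}$$

Proof


$$\pounds_{\varphi_* X}(\varphi_* f)\big|_{\varphi(p)} = \varphi_* X\big|_{\varphi(p)}(f \circ \varphi^{-1}),$$
$$= X(f \circ \varphi^{-1} \circ \varphi)(p),$$
$$= X(f)(p),$$
$$= X(f) \circ \varphi^{-1}\big|_{\varphi(p)},$$
$$\pounds_{\varphi_* X}(\varphi_* f) = \varphi_*(\pounds_X f).$$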


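7.1.10 Theorem If $\varphi : M \to N$ is a diffeomorphism and $X, Y \in \mathfrak{X}(M)$ are
vector fields on $M$, then,

$$\pounds_{\varphi_* X}(\varphi_* Y) = \varphi_*(\pounds_X Y); \quad \text{that is,}$$
$$[\varphi_* X, \varphi_* Y] = \varphi_*[X,Y]. \tag{7.25}$$

Proof Let $g \in \mathscr{F}(N)$. By equation 7.8, we have to show that
$[\varphi_* X, \varphi_* Y](g) \circ \varphi = [X,Y](g \circ \varphi)$. We have,

$$[\varphi_* X, \varphi_* Y](g) \circ \varphi = \big(\varphi_* X(\varphi_* Y(g))\big) \circ \varphi - \big(\varphi_* Y(\varphi_* X(g))\big) \circ \varphi,$$
$$= X(\varphi_* Y(g) \circ \varphi) - Y(\varphi_* X(g) \circ \varphi),$$
$$= X(Y(g \circ \varphi)) - Y(X(g \circ \varphi)),$$
$$= [X,Y](g \circ \varphi).$$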




The Lie derivative satisfies the following, 


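7.1.11 Properties. Let $f \in \mathscr{F}$, $X, Y \in \mathfrak{X}$, and $T_1$, $T_2$ be tensors. Then
a) $\pounds_X f = X(f)$,
b) $\pounds_X Y = [X,Y]$,
c) $\pounds_X(fY) = X(f)\,Y + f\,\pounds_X Y$,
d) $\pounds_X(T_1 \otimes T_2) = \pounds_X T_1 \otimes T_2 + T_1 \otimes \pounds_X T_2$,
e) $\pounds_X(C(T)) = C(\pounds_X T)$, where $C : \mathscr{T}^r_s \to \mathscr{T}^{r-1}_{s-1}$ is a contraction (See ??).
So, if $\omega$ is a one-form, we have,


$$\pounds_X\langle\omega|Y\rangle = \langle\pounds_X\omega|Y\rangle + \langle\omega|\pounds_X Y\rangle.$$

Consequently, the Lie derivative of a one form is given by,

$$\langle\pounds_X\omega|Y\rangle = \pounds_X\langle\omega|Y\rangle - \langle\omega|\pounds_X Y\rangle,$$
$$\pounds_X\omega(Y) = \pounds_X(\omega(Y)) - \omega(\pounds_X Y),$$
$$= X(\omega(Y)) - \omega([X,Y]). \tag{7.26}$$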

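Now that we know the Lie derivative of functions, vector fields, and one forms,
it is a straightforward exercise to use induction on tensor products to find the
formula for the Lie derivative of any tensor field $T \in \mathscr{T}^r_s$. The formula is,


$$\pounds_X[T(\omega^1,\dots,\omega^r,X_1,\dots,X_s)] = (\pounds_X T)(\omega^1,\dots,\omega^r,X_1,\dots,X_s)
+ \sum_{i=1}^{r} T(\omega^1,\dots,\pounds_X\omega^i,\dots,\omega^r,X_1,\dots,X_s)
+ \sum_{i=1}^{s} T(\omega^1,\dots,\omega^r,X_1,\dots,\pounds_X X_i,\dots,X_s). \tag{7.27}$$

If we set $X = X^k\partial_k$, the formula in component form reads

$$(\pounds_X T)^{i_1\dots i_r}_{j_1\dots j_s} = X^k\partial_k T^{i_1\dots i_r}_{j_1\dots j_s}
- (\partial_k X^{i_1})\,T^{k i_2\dots i_r}_{j_1\dots j_s} - \dots - (\partial_k X^{i_r})\,T^{i_1\dots i_{r-1} k}_{j_1\dots j_s}
+ (\partial_{j_1} X^k)\,T^{i_1\dots i_r}_{k j_2\dots j_s} + \dots + (\partial_{j_s} X^k)\,T^{i_1\dots i_r}_{j_1\dots j_{s-1} k}. \tag{7.28}$$


In a Riemannian manifold $\{M, g\}$ with Levi-Civita connection $\nabla$, the formula
above for the components of the Lie derivative is not manifestly covariant, but
it becomes so by replacing the $\partial_k$’s by the covariant derivative $\nabla_k$. That is,


$$(\pounds_X T)^{i_1\dots i_r}_{j_1\dots j_s} = X^k\nabla_k T^{i_1\dots i_r}_{j_1\dots j_s}
- (\nabla_k X^{i_1})\,T^{k i_2\dots i_r}_{j_1\dots j_s} - \dots - (\nabla_k X^{i_r})\,T^{i_1\dots i_{r-1} k}_{j_1\dots j_s}
+ (\nabla_{j_1} X^k)\,T^{i_1\dots i_r}_{k j_2\dots j_s} + \dots + (\nabla_{j_s} X^k)\,T^{i_1\dots i_r}_{j_1\dots j_{s-1} k}. \tag{7.29}$$


One can verify directly that all the extra terms with connection coefficients
cancel out, and the formula reduces to the previous one. If the components of
the Riemannian metric are given by $g_{\mu\nu} = g(\partial_\mu, \partial_\nu)$, and $X = X^\sigma\partial_\sigma$, we have


$$\pounds_X g_{\mu\nu} = X^\sigma\nabla_\sigma g_{\mu\nu} + (\nabla_\mu X^\sigma)\,g_{\sigma\nu} + (\nabla_\nu X^\sigma)\,g_{\mu\sigma}
= \nabla_\mu X_\nu + \nabla_\nu X_\mu,$$

where the first term vanishes because the Levi-Civita connection is metric
compatible, $\nabla_\sigma g_{\mu\nu} = 0$.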


A vector field $X$ that satisfies the equation

$$\pounds_X g = 0 \tag{7.30}$$


is called a Killing vector. In components, this is the Killing equation


$$\nabla_\mu X_\nu + \nabla_\nu X_\mu = 0. \tag{7.31}$$


If $\varphi_t$ is the one-parameter group corresponding to the flow of a Killing vector,
the maps $\varphi_t$ represent isometries of the manifold. In Minkowski space, the Killing vector
fields correspond to the generators of the Poincaré group, which contains the
Lorentz group discussed in section 8.2. Let $\alpha(t)$ be a geodesic in the manifold
with velocity vector $V$ given in local coordinates by $V = V^\mu\partial_\mu = \dot{x}^\mu(t)\,\partial_\mu$,
and suppose that $X = X^\mu\partial_\mu$ is a Killing vector, then,

$$\nabla_V(V^\mu X_\mu) = V^\nu\nabla_\nu(V^\mu X_\mu),$$
$$= V^\nu V^\mu\,\nabla_\nu X_\mu + V^\nu X_\mu\,\nabla_\nu V^\mu,$$
$$= \tfrac{1}{2} V^\mu V^\nu(\nabla_\mu X_\nu + \nabla_\nu X_\mu) + X_\mu V^\nu\nabla_\nu V^\mu,$$
$$= 0.$$


The first term vanishes because $X$ is a Killing vector and the second because
$V$ is geodesic, so that $\nabla_V V = 0$. Thus, the inner product $\langle V, X\rangle = V^\mu X_\mu$ is a
conserved quantity along the geodesic. Roughly speaking, the conserved quantity
associated with the local isometry is the momentum, which makes sense, since
a free particle travelling along a geodesic is not subjected to external forces.
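The conservation law can be illustrated in the simplest setting. The sketch below assumes, purely for illustration, the Euclidean plane with the flat metric, where geodesics are straight lines and covariant derivatives reduce to partial derivatives. It compares the rotation field $X = -y\,\partial_x + x\,\partial_y$, which satisfies the Killing equation, with the dilation field $x\,\partial_x + y\,\partial_y$, which does not. The helper names and sample data are hypothetical.

```python
def rotation_field(p):
    x, y = p
    return (-y, x)

def dilation_field(p):
    x, y = p
    return (x, y)

def killing_defect(field, p, h=1e-6):
    """Largest entry of d_i X_j + d_j X_i in the flat metric, by central differences."""
    def partial(i, j):  # derivative of component j with respect to coordinate i
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        return (field(plus)[j] - field(minus)[j]) / (2 * h)
    return max(abs(partial(i, j) + partial(j, i)) for i in range(2) for j in range(2))

def inner_product_along_line(field, start, velocity, times):
    """<V, X> evaluated along the straight-line geodesic start + t * velocity."""
    values = []
    for t in times:
        point = (start[0] + t * velocity[0], start[1] + t * velocity[1])
        X = field(point)
        values.append(velocity[0] * X[0] + velocity[1] * X[1])
    return values

start, velocity, times = (1.0, 2.0), (0.5, -1.5), [0.0, 1.0, 2.0, 3.0]

rot_defect = killing_defect(rotation_field, start)
dil_defect = killing_defect(dilation_field, start)
rot_values = inner_product_along_line(rotation_field, start, velocity, times)
dil_values = inner_product_along_line(dilation_field, start, velocity, times)
print(rot_defect, rot_values)
print(dil_defect, dil_values)
```

For the rotation field the conserved quantity is the angular momentum of the free particle about the origin; for the dilation field the inner product changes linearly in $t$, as expected for a field that is not Killing.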


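7.1.12 Theorem If $X \in \mathfrak{X}(M)$ and $\omega \in \Omega^k(M)$, $\eta \in \Omega^l(M)$, then,


$$d\,\pounds_X\omega = \pounds_X\,d\omega, \tag{7.32}$$
$$\pounds_X(\omega \wedge \eta) = \pounds_X\omega \wedge \eta + \omega \wedge \pounds_X\eta. \tag{7.33}$$


Proof Let $\varphi_t$ be the one-parameter flow of $X$. By definition,


$$\pounds_X\omega(p) = \frac{d}{dt}\big(\varphi_t^*\omega(p)\big)\Big|_{t=0}.$$


The theorem follows immediately from the fact that $d$ is linear, so it commutes
with $d/dt$, together with the product rule for $d/dt$ and the already
established formulas $\varphi_t^*\,d\omega = d\,\varphi_t^*\omega$ and


$$\varphi_t^*(\omega \wedge \eta) = \varphi_t^*\omega \wedge \varphi_t^*\eta.$$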


We recall the definition of the interior product 2.23. Let $M$ be a manifold,
$X, X_1,\dots,X_k \in \mathfrak{X}(M)$ and $\omega \in \Omega^{k+1}(M)$, then,


$$i_X\omega(X_1,\dots,X_k) = \omega(X, X_1,\dots,X_k). \tag{7.34}$$


By convention, we set $i_X f = 0$. If $\omega$ is a one-form, $i_X\omega = \omega(X)$.
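7.1.13 Theorem Let $\varphi : M \to N$ be a diffeomorphism. Then,

$$\varphi^*(i_X\omega) = i_{\varphi^* X}\,\varphi^*\omega. \tag{7.35}$$

Proof Let $\{Y = \varphi^* X,\ Y_i = \varphi^* X_i,\ i = 1,\dots,k\} \in \mathfrak{X}(M)$. We have


$$i_{\varphi^* X}\,\varphi^*\omega(Y_1,\dots,Y_k) = \varphi^*\omega(Y, Y_1,\dots,Y_k),$$
$$= \omega(\varphi_* Y, \varphi_* Y_1,\dots,\varphi_* Y_k),$$
$$= \omega(X, X_1,\dots,X_k),$$
$$= i_X\omega(X_1,\dots,X_k),$$
$$i_{\varphi^* X}\,\varphi^*\omega(Y_1,\dots,Y_k) = \varphi^* i_X\omega(Y_1,\dots,Y_k).$$


If $\varphi_t$ is a local diffeomorphism on $M$, the theorem can be restated as,


$$\varphi_t^* \circ i_X = i_{\varphi_t^* X} \circ \varphi_t^*. \tag{7.36}$$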


7.1.14 Theorem Let $i_X : \Omega^k(M) \to \Omega^{k-1}(M)$ be the interior product and
let $\alpha \in \Omega^k(M)$, $\beta \in \Omega^l(M)$, $f \in \mathscr{F}(M)$. Then

a) $i_X(\alpha \wedge \beta) = i_X\alpha \wedge \beta + (-1)^k\,\alpha \wedge i_X\beta$,

b) $i_{fX}\,\alpha = f\,i_X\alpha$,

c) $i_X\,df = \pounds_X f = X(f)$.

Proof The proof of part (a) involves some combinatorics arising from the
definition of the wedge product 2.61. We leave out the somewhat messy details.
Part (b) follows immediately from the multilinearity of $\alpha$ and the definition of
the interior product. Part (c) is trivial. We have $i_X\,df = df(X) = X(f)$.


The interior product 7.34, the intrinsic exterior derivative 7.17, and the Lie 
derivative 7.27 are related by the following formula. 


7.1.15 Theorem (H. Cartan)

$$d \circ i_X + i_X \circ d = \pounds_X. \tag{7.37}$$


Proof The proof is by induction. First, we verify that the formula is true for a
zero-form $f$ and a 1-form $\omega$.


$$(d\,i_X + i_X\,d)f = d\,i_X f + i_X\,df,$$
$$= df(X) = X(f) = \pounds_X f.$$


Let $\omega$ be a one form and $X, Y \in \mathfrak{X}(M)$.


$$(d\,i_X + i_X\,d)\omega(Y) = (d(i_X\omega))(Y) + i_X(d\omega)(Y),$$
$$= d(\omega(X))(Y) + d\omega(X,Y),$$
$$= Y(\omega(X)) + X(\omega(Y)) - Y(\omega(X)) - \omega([X,Y]),$$
$$= X(\omega(Y)) - \omega(\pounds_X Y),$$
$$= (\pounds_X\omega)(Y).$$


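Suppose the proposition is true for $k$-forms $\omega_i$. A $(k+1)$-form can be written as
a sum of terms $df_i \wedge \omega_i$, which for simplicity we write as $df \wedge \omega$. By the
induction hypothesis,


$$d\,i_X\omega + i_X\,d\omega = \pounds_X\omega.$$


Using the Leibniz rule, and the properties $d \circ d = 0$, $i_X f = 0$, $i_X\,df = \pounds_X f$, we
compute,

$$(d\,i_X + i_X\,d)(df \wedge \omega) = d\,i_X(df \wedge \omega) + i_X\,d(df \wedge \omega),$$
$$= d(i_X df \wedge \omega - df \wedge i_X\omega) - i_X(df \wedge d\omega),$$
$$= d\,\pounds_X f \wedge \omega + \pounds_X f \wedge d\omega - (-df \wedge d\,i_X\omega) - (i_X df \wedge d\omega - df \wedge i_X\,d\omega),$$
$$= d\,\pounds_X f \wedge \omega + \pounds_X f \wedge d\omega - \pounds_X f \wedge d\omega + df \wedge (d\,i_X\omega + i_X\,d\omega),$$
$$= \pounds_X\,df \wedge \omega + df \wedge \pounds_X\omega, \quad (\text{by induction and } \pounds_X\,d = d\,\pounds_X),$$
$$= \pounds_X(df \wedge \omega),$$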


which is what we wanted to establish. The diagram in figure 7.3, which is
reminiscent of chain-complexes in singular homology, helps to visualize this
most elegant result, sometimes called Cartan’s magic formula.


Fig. 7.3: Cartan’s Magic Formula
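A small worked example may help fix the signs in Cartan's formula. The choice of manifold, vector field and form below is an illustrative assumption, not taken from the text: $M = \mathbb{R}^2$, $X = y\,\partial_x$ and $\omega = x\,dy$.

```latex
% Left-hand side: (d i_X + i_X d) omega
i_X\omega = x\,dy(y\,\partial_x) = 0
  \quad\Longrightarrow\quad d(i_X\omega) = 0,
\\
d\omega = dx \wedge dy
  \quad\Longrightarrow\quad
  i_X\,d\omega = dx(X)\,dy - dy(X)\,dx = y\,dy,
\\
(d\,i_X + i_X\,d)\,\omega = y\,dy.
\\[1ex]
% Right-hand side: Lie derivative from formula (7.26),
% (L_X omega)(Y) = X(omega(Y)) - omega([X,Y])
[X,\partial_x] = 0, \qquad [X,\partial_y] = -\partial_x,
\\
(\pounds_X\omega)(\partial_x) = X(0) - \omega(0) = 0,
\qquad
(\pounds_X\omega)(\partial_y) = X(x) - \omega(-\partial_x) = y - 0 = y,
\\
\pounds_X\omega = y\,dy.
```

Both sides give $y\,dy$, in agreement with equation 7.37.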


Cartan’s magic formula is useful in establishing the Poincaré lemma. We
recall from example 2.82 that a closed form need not be exact. In more general
spaces, such as spheres, there are topological considerations. In fact, consider
the de-Rham complex,


$$\cdots \xrightarrow{\ d\ } \Omega^{k-1}(M) \xrightarrow{\ d\ } \Omega^{k}(M) \xrightarrow{\ d\ } \Omega^{k+1}(M) \xrightarrow{\ d\ } \cdots \tag{7.38}$$


In algebraic topology, a sequence such as this one, for which $d \circ d = 0$, is called
a cochain complex. If one lets $Z^k(M)$ be the set of closed $k$-forms on a
manifold $M$, and $B^k(M)$ be the set of exact forms, the quotient


$$H^k(M) = Z^k(M)/B^k(M)$$


is called the $k$-th cohomology group $H^k(M)$ of that space. Two closed $k$-forms
$\omega$ and $\omega'$ are in the same cohomology class if their difference is exact; that is,
there exists a $(k-1)$-form $\phi$, such that


$$\omega' = \omega + d\phi.$$


The de-Rham cohomology groups have deep connections to the topology of the
space. A key topological concept we need is that of a homotopy which we define
as follows


7.1.16 Definition Let $M'$ and $M$ be smooth manifolds. Two smooth maps
$f, g : M' \to M$ are called homotopic if there exists a map $\phi : M' \times [0,1] \to M$,
such that


a) $\phi(p', 0) = f(p')$,
b) $\phi(p', 1) = g(p')$.

If we let $\phi_t(p') = \phi(p', t)$, $t \in [0,1]$, then the homotopy describes a smooth
deformation of $g = \phi_1$ to $f = \phi_0$. A manifold $M$ is contractible to a point
$p_0 \in M$, if there exists a homotopy


$$\phi : M \times [0,1] \to M$$


for which $\phi_1(p) = g(p) = p$ is the identity map, and $\phi_0(p) = f(p) = p_0$ is the
constant map.


7.1.17 Poincaré Lemma

If $M$ is a manifold which is smoothly contractible to a point, a closed form $\omega$
is exact.

Proof We present a proof in the special case in which the manifold is a ball
$B^n \subset \mathbb{R}^{n+1}$ centered at the origin. This is the alternative proof that appears
in Abraham-Marsden [20]. For all $p \in B^n$, and $0 < t \leq 1$, let $\phi_t$ be the
one-parameter family of diffeomorphisms defined by $\phi_t(p) = tp$. This kind of
homotopy map is an example of a deformation retract. When $t = 1$, the map
is the identity map and as $t \to 0$, the ball continuously shrinks to a point. Let
$X_t$ be the vector field $X_t(p) = p/t$. We have,

$$\frac{d}{dt}\phi_t(p) = p = X_t(\phi_t(p)),$$

so $X_t$ is the tangent vector field of the one-parameter family of curves. Let $\omega$
be a closed $k$-form in $B^n$. By the definition of the Lie derivative and Cartan’s
magic formula 7.37, we have


$$\frac{d}{dt}(\phi_t^*\omega) = \phi_t^*(\pounds_{X_t}\omega),$$
$$= \phi_t^*(d\,i_{X_t}\omega + i_{X_t}\,d\omega),$$
$$= \phi_t^*(d\,i_{X_t}\omega).$$


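Integrating from a small value $\epsilon > 0$ to 1, we get,

$$\phi_t^*\omega\,\Big|_\epsilon^1 = \int_\epsilon^1 \phi_t^*(d\,i_{X_t}\omega)\,dt,$$
$$= d\int_\epsilon^1 \phi_t^*(i_{X_t}\omega)\,dt.$$


Taking the limit as $\epsilon \to 0$ and recalling that at $t = 1$, $\phi_t$ is the identity map,
we get


$$\omega = d\beta, \qquad \beta = \int_0^1 \phi_t^*(i_{X_t}\omega)\,dt. \tag{7.39}$$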


Although the theorem is proved, it is helpful to find a more explicit formula for
the $(k-1)$-form $\beta$, using the definitions of the push-forward and the interior
product. Let $\{e_1,\dots,e_{k-1}\}$ be part of a set of basis vectors and $p$ the position
vector at the point $p$. Then


$$\beta_p(e_1,\dots,e_{k-1}) = \int_0^1 \phi_t^*(i_{X_t}\omega)(e_1,\dots,e_{k-1})\,dt,$$
$$= \int_0^1 \omega_{\phi_t(p)}(p, \phi_{t*}e_1,\dots,\phi_{t*}e_{k-1})\,dt,$$
$$= \int_0^1 \omega_{tp}(p, te_1,\dots,te_{k-1})\,dt,$$

so,

$$\beta_p = \int_0^1 t^{k-1}\,\omega_{tp}(p, e_1,\dots,e_{k-1})\,dt. \tag{7.40}$$


The more general theorem is computationally more complicated, but the idea is
essentially the same. There is a natural one-parameter family of maps
$\phi_t : M \times [0,1] \to M$ such that $\phi_1$ is the identity and $\phi_0$ is a constant map. One then
seeks a linear map $h : \Omega^k \to \Omega^{k-1}$ such that$^1$,


$$\phi_1^*\omega - \phi_0^*\omega = d(h\omega) + h(d\omega). \tag{7.41}$$


This property can be represented by the diagram 7.4 which is an example
of a chain homotopy. The linear map we seek can be obtained by defining
$h\omega(p) = \beta_p$. By choice of the homotopy map, the left-hand-side of equation
7.41 is the identity map on forms. By a direct, but non-trivial computation
(see [20], [34]), one can verify that the right-hand-side of the equation is also equal
to $\omega$. Thus, if $d\omega = 0$ we have


$$\omega = d(h\omega)$$


which is what we wanted to establish.
Fig. 7.4: Chain Homotopy

$^1$It is common to index the maps with the order of the forms. For example, one would
write $h^k : \Omega^k \to \Omega^{k-1}$, and $d^k : \Omega^k \to \Omega^{k+1}$.

This general form of the theorem is rough for the novice. In Abraham-Marsden,
the form $h\omega = \beta$ just pops out of nowhere, so without the alternative
proof, the whole process appears rather mysterious. Spivak provides some
motivation for finding $h\omega$ by first treating the case where $\omega$ is a one form. In his
book Calculus on Manifolds he carries out the full computation when $M$ is a
star-shaped region in $\mathbb{R}^n$. Either way, the computation is more difficult. The
Poincaré lemma is often stated by saying that in a manifold $M$, a closed form
is locally exact; that is, given a closed form on an open set $U \subset M$, then for
each point $p \in U$, there exists a ball $B \subset M$ centered at $p$ in which the form
is exact. Perhaps an explicit construction of the form $h\omega$ in $\mathbb{R}^3$ might help
the reader understand the nature of the constructive proof. Let $\mathbf{B}$ be a vector
field with $\nabla \cdot \mathbf{B} = 0$ and $p = (x, y, z)$. We seek a vector potential $\mathbf{A}$, such that
$\nabla \times \mathbf{A} = \mathbf{B}$. Write $p$ as a tangent vector $p = x^k\partial_k$, and map $\mathbf{B}$ into the 2-form


$$\omega = B_1(p)\,dx^2 \wedge dx^3 - B_2(p)\,dx^1 \wedge dx^3 + B_3(p)\,dx^1 \wedge dx^2.$$


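The components of the 1-form $\alpha = A'_j\,dx^j$ constructed in the proof of the
Poincaré lemma are


$$A'_j = h\omega(p)(\partial_j),$$
$$= \int_0^1 t\,\omega_{tp}(x^k\partial_k, \partial_j)\,dt,$$
$$= \int_0^1 t\,\big[B_1(tp)\,dx^2 \wedge dx^3 - B_2(tp)\,dx^1 \wedge dx^3 + B_3(tp)\,dx^1 \wedge dx^2\big](x^k\partial_k, \partial_j)\,dt.$$


Since $dx^k(\partial_j) = \delta^k_j$, we get


$$A'_1 = \int_0^1 t\,[x^3 B_2(tp) - x^2 B_3(tp)]\,dt,$$
$$A'_2 = \int_0^1 t\,[x^1 B_3(tp) - x^3 B_1(tp)]\,dt,$$
$$A'_3 = \int_0^1 t\,[x^2 B_1(tp) - x^1 B_2(tp)]\,dt.$$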


For an example let $\mathbf{B} = (y, -z^2, -x)$ and see if we can recover the vector
potential $\mathbf{A} = (xy, -yz, xz^2)$ which we used secretly to produce the field. The
computation yields

$$A'_1 = \int_0^1 t\,[z(-t^2z^2) - y(-tx)]\,dt = -\tfrac{1}{4}z^3 + \tfrac{1}{3}xy,$$
$$A'_2 = \int_0^1 t\,[x(-tx) - z(ty)]\,dt = -\tfrac{1}{3}x^2 - \tfrac{1}{3}yz,$$
$$A'_3 = \int_0^1 t\,[y(ty) - x(-t^2z^2)]\,dt = \tfrac{1}{3}y^2 + \tfrac{1}{4}xz^2.$$


One can easily verify that the curl of the resulting vector potential


$$\mathbf{A}' = \big(-\tfrac{1}{4}z^3 + \tfrac{1}{3}xy,\ -\tfrac{1}{3}x^2 - \tfrac{1}{3}yz,\ \tfrac{1}{3}y^2 + \tfrac{1}{4}xz^2\big)$$


does indeed give the same field $\mathbf{B}$, but we failed to recover the original potential.
On the other hand, the difference


$$\mathbf{A} - \mathbf{A}' = \nabla f, \quad \text{where } f = \tfrac{1}{3}x^2y + \tfrac{1}{4}xz^3 - \tfrac{1}{3}y^2z,$$


so the two potentials are cohomologous, as expected. This academic example
shows the cleverness of the definition of $h\omega$ but it also squelches the hope of
an easy way out of solving Maxwell’s equations for magnetic fields. To obtain
the vector potentials in the right gauge on problems of physical significance,
students are much better off reading the Feynman Lectures on Physics.
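The construction can be reproduced numerically as a cross-check. The sketch below is only an illustration: the helper names, the sample point, the use of composite Simpson's rule for the $t$-integral, and the central-difference curl are all assumptions made for the example. It evaluates the homotopy integrals for $\mathbf{B} = (y, -z^2, -x)$, compares them with the closed-form potential, and verifies that the curl of the latter reproduces $\mathbf{B}$.

```python
def B(p):
    x, y, z = p
    return (y, -z * z, -x)

def homotopy_potential(p, n=200):
    """A'_j = integral over [0,1] of t * omega_{tp}(p, d_j), by composite Simpson's rule (n even)."""
    x, y, z = p

    def integrand(t):
        b1, b2, b3 = B((t * x, t * y, t * z))
        return (t * (z * b2 - y * b3), t * (x * b3 - z * b1), t * (y * b1 - x * b2))

    h = 1.0 / n
    total = [0.0, 0.0, 0.0]
    for i in range(n + 1):
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        for j, value in enumerate(integrand(i * h)):
            total[j] += weight * value
    return tuple(h * s / 3 for s in total)

def closed_form_potential(p):
    x, y, z = p
    return (-z**3 / 4 + x * y / 3, -x * x / 3 - y * z / 3, y * y / 3 + x * z * z / 4)

def curl(field, p, h=1e-5):
    """Curl of a vector field at p, using central differences."""
    def d(i, j):  # derivative of component j with respect to coordinate i
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        return (field(plus)[j] - field(minus)[j]) / (2 * h)
    return (d(1, 2) - d(2, 1), d(2, 0) - d(0, 2), d(0, 1) - d(1, 0))

p = (0.8, -1.1, 0.6)
numeric_A = homotopy_potential(p)
exact_A = closed_form_potential(p)
curl_A = curl(closed_form_potential, p)
print(numeric_A, exact_A)
print(curl_A, B(p))
```

Since the integrands are cubic polynomials in $t$, Simpson's rule reproduces the integrals up to round-off, so the first comparison is essentially exact.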


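7.1.18 Theorem


$$[\pounds_X, i_Y] = \pounds_X \circ i_Y - i_Y \circ \pounds_X = i_{[X,Y]}, \tag{7.42}$$
$$[\pounds_X, \pounds_Y] = \pounds_X \circ \pounds_Y - \pounds_Y \circ \pounds_X = \pounds_{[X,Y]}. \tag{7.43}$$

Proof Let $\omega$ be a $k$-form and $Y, X, X_1,\dots,X_{k-1}$ be vector fields. The proof
is by direct computation using the formula for Lie derivatives 7.27 and the
definition of the interior product 7.34.


$$(i_Y\pounds_X\omega)(X_1,\dots,X_{k-1}) = (\pounds_X\omega)(Y, X_1,\dots,X_{k-1}),$$
$$= \pounds_X(\omega(Y, X_1,\dots,X_{k-1})) - \sum_{i=1}^{k-1} \omega(Y, X_1,\dots,[X,X_i],\dots,X_{k-1})$$
$$\quad - \omega([X,Y], X_1,\dots,X_{k-1}), \quad (\text{term not included in sum})$$
$$= \pounds_X\big(i_Y\omega(X_1,\dots,X_{k-1})\big) - \sum_{i=1}^{k-1} i_Y\omega(X_1,\dots,[X,X_i],\dots,X_{k-1})$$
$$\quad - i_{[X,Y]}\omega(X_1,\dots,X_{k-1}),$$
$$= (\pounds_X i_Y - i_{[X,Y]})\,\omega(X_1,\dots,X_{k-1}).$$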

We leave the second part as an exercise, using the formula of Cartan 7.37.
A direct consequence of equation 7.43 is that if $X$ and $Y$ are Killing vector
fields in a Riemannian manifold $\{M, g\}$, so that $\pounds_X g = \pounds_Y g = 0$, then $[X,Y]$
is also a Killing vector field. Thus, the set of Killing vector fields forms a Lie
subalgebra of the Lie algebra of vector fields, in the sense described in the
section that follows.


7.2 Lie Algebras 


7.2.1 Definition A vector space $\mathfrak{g}$ over a field $F$ (here, $F = \mathbb{R}$ or $\mathbb{C}$) is
called a Lie algebra if there exists an operation $[\ ,\ ] : \mathfrak{g} \times \mathfrak{g} \to \mathfrak{g}$ (called the Lie
Bracket), such that,
a) $[\ ,\ ]$ is $F$-bilinear,
b) $[X,Y] = -[Y,X]$, for all $X, Y \in \mathfrak{g}$,
c) $[X,[Y,Z]] + [Y,[Z,X]] + [Z,[X,Y]] = 0$, (Jacobi identity).
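The axioms can be checked on a familiar instance. The sketch below assumes, as an illustration only, that the bracket is the matrix commutator $[X,Y] = XY - YX$ on $3 \times 3$ matrices; the three sample matrices are arbitrary integer choices, which keeps the arithmetic exact.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def bracket(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]

def add(*matrices):
    """Entrywise sum of any number of square matrices of equal size."""
    n = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) for j in range(n)] for i in range(n)]

# Integer matrices keep the arithmetic exact.
X = [[0, 1, 2], [-1, 0, 3], [4, -2, 1]]
Y = [[2, 0, -1], [1, 1, 0], [0, 5, -3]]
Z = [[1, -4, 0], [0, 2, 2], [3, 0, -1]]

zero = [[0] * 3 for _ in range(3)]
antisymmetry = add(bracket(X, Y), bracket(Y, X))
jacobi = add(bracket(X, bracket(Y, Z)),
             bracket(Y, bracket(Z, X)),
             bracket(Z, bracket(X, Y)))
print(antisymmetry == zero, jacobi == zero)
```

For the commutator, the Jacobi identity is a consequence of the associativity of matrix multiplication, so the check succeeds for any choice of matrices.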


7.2.2 Definition A Lie subalgebra h of a Lie algebra g is a subspace that is 
closed under the Lie Bracket. 


7.2.3 Theorem The vector space $\mathfrak{X}(M)$ together with the operation $\pounds_X Y =
[X,Y]$ gives the space the structure of a Lie Algebra.
The Jacobi identity follows directly from equation 7.43 applied to vector fields,
but a much more elementary proof follows directly from the definition of the
bracket of two vector fields and the property of vector fields being linear
derivations on the space of functions. The details of the proof have already appeared
in theorem 4.4.2.

Lie subalgebras are intricately connected with the theory of submanifolds
$N \subset M$. They can be used to generalize the idea of integral curves. Let
$M$ be an $n$-dimensional manifold, and $p$ be a point $p \in M$. A $k$-dimensional
distribution at $p$ is a $k$-dimensional subspace $D_p \subset T_pM$. Let $\{X_1, X_2,\dots,X_k\}$ be a set of
linearly independent vector fields in a neighborhood $U$ of $p$ which constitutes a
basis for $D_q$, $q \in U$. If these can be chosen in a smooth way, $D$ is called a $C^\infty$
distribution.


7.2.4 Definition Let $\{X_1, X_2, \dots, X_k\}|_p$, $p \in U$, span a distribution D.
The distribution is integrable if the vectors form a subalgebra of the Lie
algebra of vector fields in M. That is, there exist $C^\infty$ functions
$C^k_{\;ij}$ such that

$$[X_i, X_j] = C^k_{\;ij} X_k.$$


7.2.5 Definition A distribution D arises from a foliation of M, if for each
point $p \in M$, there exists a k-dimensional local submanifold N of M, containing
p with

$$i_*(T_pN) = D_p, \quad \text{for all } p \in U,$$

where $i : N \to M$ is the inclusion map.


Locally, foliations look like layers of local submanifolds, called the leaves
of the foliation. Perhaps the most famous foliation is the Reeb foliation of $S^3$
that locally looks like onion layers formed by a sock pushed infinitely into
itself. A standard method to treat the field equations in general relativity is
to think of a 4-dimensional Lorentzian manifold as a 3+1 manifold, foliated by
3-dimensional spatial surfaces evolving in time.


The main result on distributions is, 


7.2.6 Theorem (Frobenius) A distribution D is integrable if and only if, it 
arises from a foliation. 


An alternative formulation of the Frobenius integrability theorem can be stated
in terms of differential forms. Let $\Omega^*(M)$ be the graded ring of smooth
differential forms. The forms $\omega$ that annihilate D, namely, those for
which, for all $X_1, \dots, X_p$ in D,

$$\omega(X_1, \dots, X_p) = 0, \tag{7.44}$$

generate an ideal $\mathcal{I}(D)$. If $\{e_1, \dots, e_k\}$ is an orthonormal frame in D,
then the dual forms $\{\theta^1, \dots, \theta^k\}$ span $\mathcal{I}(D)$. The differential
forms version of the Frobenius theorem says that D is integrable if for every
$\omega \in \mathcal{I}(D)$, we have $d\omega \in \mathcal{I}(D)$; that is, the differential
ideal is closed under exterior derivatives. More specifically, there exist
forms $\alpha^a_{\;b}$, such that

$$d\theta^a = \alpha^a_{\;b} \wedge \theta^b \in \mathcal{I}(D).$$

The connection between the two versions of the theorem is achieved through
the structure constant formula,

$$d\theta^c = -\tfrac{1}{2}\, C^c_{\;ab}\, \theta^a \wedge \theta^b,$$

which we prove in equation 7.67.

If we extend the orthonormal frame spanning D to an orthonormal frame
$\{e_1, \dots, e_k, e_{k+1}, \dots, e_n\}$, with dual forms
$\{\theta^1, \dots, \theta^k, \theta^{k+1}, \dots, \theta^n\}$, then the integral
submanifolds are defined by the (Pfaffian) system,

$$\theta^m = 0, \quad m = k+1, \dots, n.$$

The theorem guarantees the existence of local coordinates $\{x^1, \dots, x^n\}$ about
$p \in U$, with tangent vectors $\{\partial/\partial x^1, \dots, \partial/\partial x^k\}|_p$
spanning $D_p$, and with the dual forms

$$dx^{k+1}, \dots, dx^n$$

annihilating D. The forms $\{\theta^{k+1}, \dots, \theta^n\}$ can then be written as
linear combinations of the coordinate one-forms above. The sets

$$N = \{x^{k+1} = a^{k+1}, \dots, x^n = a^n\},$$

where the a's are constants, are integral submanifolds of the distribution D.


We do not present a proof of this very important theorem in this rather
perfunctory treatment of the topic. Instead, we refer the reader to classic
textbooks such as [20] or [34]. The theorem is the starting point to the deep
subject of Lie groups of symmetries of partial differential equations and
prolongation theory.


Given any Lie group G, we can construct an associated Lie algebra $\mathfrak{g}$.
Given a group element $g \in G$, we define the left and the right translation
maps $L_g, R_g : G \to G$ by,

$$L_g(g_0) = g g_0, \tag{7.45}$$
$$R_g(g_0) = g_0 g, \quad \text{for all } g_0 \in G. \tag{7.46}$$

Hereafter, for every statement we make about left translation, there is a
corresponding statement about right translation. The map $L_g$ is a
diffeomorphism with inverse given by $(L_g)^{-1} = L_{g^{-1}}$.


Fig. 7.5: Left Invariant Vector Field.


7.2.7 Definition A vector field $X \in \mathfrak{X}(G)$ is called left invariant if
for all $g \in G$, we have,

$$L_{g*} X = X. \tag{7.47}$$

More specifically, if $X_{g_0}$ is the tangent vector at $g_0$, then,

$$L_{g*} X_{g_0} = X_{L_g(g_0)} = X_{g g_0},$$

is the tangent vector at $g g_0$.


7.2.8 Definition Let g = L(G) be the set of all left-invariant vector fields 
in G. Then g has the structure of a Lie algebra. 
Proof Let $X, Y \in \mathfrak{g}$. Then by equation 7.25 we have,

$$L_{g*}[X, Y] = [L_{g*}X, L_{g*}Y] = [X, Y],$$

so $[X, Y] \in \mathfrak{g}$. Thus, the set of left-invariant vector fields is closed
under Lie brackets and hence it is a Lie subalgebra of the Lie algebra of all
vector fields $\mathfrak{X}(G)$.

Let $T_eG$ be the tangent space at the identity of a Lie group. For every
tangent vector $X_e \in T_eG$ we can generate a vector field X by simply defining
the value of the vector field at any point $g \in G$, to be the tangent vector,

$$X_g = L_{g*} X_e. \tag{7.48}$$

The vector field X so defined is almost tautologically left invariant. Indeed, if
$g_0 \in G$,

$$
\begin{aligned}
L_{g*} X_{g_0} &= L_{g*}(L_{g_0 *}) X_e, \\
&= (L_{g g_0})_* X_e, \\
&= X_{g g_0}.
\end{aligned}
$$

For each left invariant vector field X with value $X_g$ at g, there is a unique
tangent vector $X_e = L_{(g^{-1})*} X_g$ at the identity, so we have the following
theorem,


7.2.9 Theorem The tangent vector space at the identity $T_eG$ of a Lie group
is isomorphic to the Lie algebra of left-invariant vector fields $\mathfrak{g} = L(G)$.
The isomorphism is obtained by assigning to any left-invariant vector field, its
value at the identity.

We now consider the behavior of left invariant vector fields under mappings.
Let $\phi : G \to H$ be a homomorphism between two Lie groups G and H with
identity elements e and e' respectively, and push-forward map
$\phi_* : T_eG \to T_{e'}H$. Consider a tangent vector $X_e \in T_eG$ generating a
left invariant vector field X.

Fig. 7.6: Lie Algebra Homomorphism


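So, if $g \in G$, then $L_{g*} X = X$. Denote by Y the left invariant vector field
in H whose value at the identity is $Y_{e'} = \phi_* X_e$. As shown in figure 7.6,
we have,

$$\phi \circ L_g = L_{\phi(g)} \circ \phi.$$

Then, for the push-forward of the vector field X at g, we have,

$$
\begin{aligned}
\phi_* X_g &= \phi_* L_{g*} X_e, \\
&= (\phi \circ L_g)_* X_e, \\
&= (L_{\phi(g)} \circ \phi)_* X_e, \\
&= L_{\phi(g)*}\, \phi_* X_e, \\
&= L_{\phi(g)*}\, Y_{e'}, \\
&= Y_{\phi(g)}.
\end{aligned}
$$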




Therefore, the push-forward of a left-invariant vector field X is a left-invariant
vector field Y. Since the push-forward preserves brackets, the map $\phi_*$ is a Lie
algebra homomorphism, that is,

$$
\begin{aligned}
\phi_*(aX^1 + bX^2) &= a\,\phi_* X^1 + b\,\phi_* X^2, \\
\phi_*[X^1, X^2] &= [\phi_* X^1, \phi_* X^2].
\end{aligned}
\tag{7.49}
$$


7.2.1 The Exponential Map 


Let $\phi : \mathbb{R} \to G$ be a smooth Lie group homomorphism, and let
$A = \phi'(0) \in T_eG$ be a tangent vector at the identity. Such a homomorphism is
called a one-parameter subgroup of G. Consider the case where
$G = GL(n, \mathbb{R})$. This is a matrix group, so a tangent vector at the identity
is a matrix; this is the reason why we changed the notation from X to A. Since

$$\phi(s + t) = \phi(s) \cdot \phi(t),$$

evaluating the derivative with respect to s at s = 0 gives,

$$
\begin{aligned}
\frac{d}{dt}\phi(t) &= \phi'(0) \cdot \phi(t), \\
&= A\,\phi(t), \quad \text{with} \quad \phi(0) = I.
\end{aligned}
\tag{7.50}
$$


Here, the dot is matrix multiplication. By analogy to the one-dimensional case,
the solution of this differential equation is,

$$\phi(t) = e^{tA}, \tag{7.51}$$

where the exponential of a matrix is defined as the power series,

$$e^A = I + A + \frac{1}{2!}A^2 + \frac{1}{3!}A^3 + \dots \tag{7.52}$$

We see that the one-parameter family of matrices $\phi(t) = e^{tA}$ is the integral
curve of the vector A. This leads to the following definition,
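The power series 7.52 is easy to experiment with numerically. The sketch below, which is an illustration rather than part of the text, sums a truncated series in plain Python and checks the homomorphism property $\phi(s+t) = \phi(s)\cdot\phi(t)$ and the initial condition $\phi'(0) = A$. The function names, the truncation at 30 terms, and the sample matrix are assumptions made for the example.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(c, a):
    """Scalar multiple c * a."""
    return [[c * x for x in row] for row in a]

def expm(a, terms=30):
    """Truncated power series I + A + A^2/2! + ... of equation 7.52."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]            # holds A^k / k!
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, a))
        result = [[r + t for r, t in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    return result

def max_diff(a, b):
    """Largest entrywise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

A = [[0.0, 1.0], [-1.0, 0.0]]      # sample matrix (hypothetical choice)
s, t = 0.3, 0.5

# phi(s + t) = phi(s) . phi(t)
homomorphism_error = max_diff(expm(scale(s + t, A)),
                              matmul(expm(scale(s, A)), expm(scale(t, A))))

# phi'(0) = A, estimated with a central difference
h = 1e-5
derivative = scale(1.0 / (2 * h),
                   [[p - m for p, m in zip(rp, rm)]
                    for rp, rm in zip(expm(scale(h, A)), expm(scale(-h, A)))])
derivative_error = max_diff(derivative, A)

# For this particular A the series sums to a rotation matrix,
# so the upper right entry of exp(tA) should equal sin(t).
rotation_error = abs(expm(scale(t, A))[0][1] - math.sin(t))

print(homomorphism_error, derivative_error, rotation_error)
```

A fixed truncation is adequate only for matrices of modest norm; a production implementation would use a library routine instead.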


7.2.10 Definition For any Lie group G, we define the exponential map

$$\exp : \mathfrak{g} \to G \tag{7.53}$$

as follows. Let $A \in T_eG$, and let $\phi : \mathbb{R} \to G$ be the unique
homomorphism such that $\phi'(0) = A$. Then,

$$\exp(A) = e^A = \phi(1).$$


Clearly, 




The map $A \mapsto e^{A}$ is a local diffeomorphism from $T_eG$ to G. The maximal
extension of the integral curve is the one-parameter subgroup of G indicated at
the beginning of this subsection. The converse is also true. Any one-parameter
subgroup of G is generated by a map $t \mapsto e^{tA}$, for some $A \in T_eG$. Since
$\mathfrak{g} = L(G)$, there is a one-to-one correspondence between the Lie algebra of a
Lie group and the one-parameter subgroups. Roughly speaking, the exponential
map yields a neighborhood of $e \in G$, which is filled by one-parameter subgroups
emanating from e by the integral curves of tangent vectors $A \in T_eG$. In fact,
if the diagram in figure 7.6 and the result in equation 7.49 are applied to the
homomorphism $\phi : \mathbb{R} \to G$ with the condition that,

$$\phi_*\left(\frac{d}{dt}\Big|_0\right) = A,$$

then $\phi_*$ is a Lie algebra homomorphism. The left-invariant vector field generated
by A is the vector field tangent to the unique integral curve given by the map
$t \mapsto e^{tA}$. We define,


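$$\ln(e^A) = A, \qquad e^{\ln(I + A)} = I + A,$$

wherever the formal power series converges. If A and B are two matrices near
0, we can study the behavior of

$$\ln(e^A e^B) = \ln\left[\left(I + A + \tfrac{1}{2}A^2 + \dots\right)\left(I + B + \tfrac{1}{2}B^2 + \dots\right)\right].$$

If we only retain the first order terms, we get,

$$\ln(e^A e^B) = \ln(I + A + B) = A + B.$$

If we wish to compute the quadratic terms, we formally multiply the power series
for the exponentials, and then use the formal series expansion
$\ln(I + X) = X - \tfrac{1}{2}X^2 + \dots$. However, we need to be careful with the
fact that matrix multiplication does not commute. The result is,

$$
\begin{aligned}
\ln(e^A e^B) &= \ln\left[I + A + B + \tfrac{1}{2}A^2 + AB + \tfrac{1}{2}B^2 + \dots\right], \\
&= \left(A + B + \tfrac{1}{2}A^2 + AB + \tfrac{1}{2}B^2\right)
   - \tfrac{1}{2}\left(A + B + \tfrac{1}{2}A^2 + AB + \tfrac{1}{2}B^2\right)^2 + \dots, \\
&= A + B + \tfrac{1}{2}A^2 + AB + \tfrac{1}{2}B^2 - \tfrac{1}{2}(A + B)^2 + \dots, \\
&= A + B + \tfrac{1}{2}[A, B] + \dots
\end{aligned}
$$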


The full expansion is called the Campbell-Baker-Hausdorff (CBH) formula. The
terms up to third order are,

$$\ln(e^A e^B) = A + B + \tfrac{1}{2}[A, B] + \tfrac{1}{12}[A, [A, B]] - \tfrac{1}{12}[B, [A, B]] + \dots \tag{7.54}$$

All the terms of order two or higher are expressible in terms of brackets, so

$$[A, B] = 0 \implies e^A e^B = e^{A + B}.$$
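A quick numerical experiment makes the role of the successive bracket terms visible. The sketch below is an illustration only: it compares $e^A e^B$ with the exponential of the CBH series truncated at first, second, and third order, for two small non-commuting matrices. The helper names, the scale factor `eps`, and the particular matrices are assumptions chosen for the example.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    """Scalar multiple c * a."""
    return [[c * x for x in row] for row in a]

def bracket(a, b):
    """Commutator [a, b] = ab - ba."""
    return add(matmul(a, b), scale(-1.0, matmul(b, a)))

def expm(a, terms=30):
    """Truncated power series for the matrix exponential."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, a))
        result = add(result, term)
    return result

def max_diff(a, b):
    """Largest entrywise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

eps = 0.1                                      # keeps A and B near 0
A = scale(eps, [[0.0, 1.0], [0.0, 0.0]])       # sample non-commuting pair
B = scale(eps, [[0.0, 0.0], [1.0, 0.0]])

AB = bracket(A, B)
first = add(A, B)                                          # A + B
second = add(first, scale(0.5, AB))                        # + 1/2 [A, B]
third = add(second, add(scale(1.0 / 12, bracket(A, AB)),   # + 1/12 [A,[A,B]]
                        scale(-1.0 / 12, bracket(B, AB)))) # - 1/12 [B,[A,B]]

target = matmul(expm(A), expm(B))
errors = [max_diff(expm(z), target) for z in (first, second, third)]
print(errors)   # each truncation should be closer to e^A e^B than the last
```

Each additional order of the series reduces the discrepancy by roughly a factor of `eps`, which is what one expects from an expansion whose n-th order terms are n-fold products of A and B.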




Exponentiating the CBH formula 7.54 and keeping only the terms up to order 
2, we can establish the following formulas, 


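7.2.11 Theorem

$$
\begin{aligned}
e^{tA} e^{tB} &= \exp\left\{t(A + B) + \tfrac{1}{2}t^2[A, B] + O(t^3)\right\}, \\
e^{-tA} e^{-tB} e^{tA} e^{tB} &= \exp\left\{t^2[A, B] + O(t^3)\right\}, \\
e^{tA} e^{tB} e^{-tA} &= \exp\left\{tB + t^2[A, B] + O(t^3)\right\}.
\end{aligned}
\tag{7.55}
$$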


The first of these formulas follows immediately from the CBH formula. The 
other two require more manipulation of exponential and logarithmic expansions 
along the same lines as in the computations leading to 7.54. A complete proof 
of this theorem can be found in [34]. Next, we prove that the exponential
map is natural with respect to the push-forward. 


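7.2.12 Theorem Let $\phi : G \to H$ be a smooth homomorphism between two
Lie groups G and H. Then the exponential loop in the following diagram commutes,

$$
\begin{array}{ccc}
T_eG & \xrightarrow{\;\phi_*\;} & T_{e'}H \\
\downarrow {\scriptstyle \exp} & & \downarrow {\scriptstyle \exp} \\
G & \xrightarrow{\;\phi\;} & H
\end{array}
$$

That is,

$$\phi \circ \exp = \exp \circ\, \phi_*. \tag{7.56}$$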


Proof Let $A \in T_eG$ and let $\alpha : \mathbb{R} \to G$ be a one-parameter subgroup
of G given by $\alpha(t) = e^{tA}$. Then, $A = \alpha'(0) = \alpha_*\left(\frac{d}{dt}\big|_0\right)$.
Define $\psi : \mathbb{R} \to H$ by the composition $\psi = \phi \circ \alpha$. That is,

$$\psi(t) = \phi(e^{tA}).$$

We have,

$$\psi(s + t) = \phi(e^{sA} e^{tA}) = \phi(e^{sA})\,\phi(e^{tA}) = \psi(s)\,\psi(t),$$

so $\psi$ is a one-parameter subgroup of H. The rest of the proof amounts to
untangling the definition of the push-forward as shown in equation 1.25. Suppose
that $B \in T_{e'}H$ is such that $\psi(t) = e^{tB}$. That means that $B = \psi'(0)$.
We consider the action of this tangent vector on an arbitrary smooth function
$f : H \to \mathbb{R}$. We have,


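$$
\begin{aligned}
B(f) = \psi'(0)(f) &= \frac{d}{dt}(f \circ \psi)\Big|_0, \\
&= \frac{d}{dt}(f \circ \phi \circ \alpha)\Big|_0, \\
&= \alpha'(0)(f \circ \phi) = A(f \circ \phi), \\
&= \phi_* A(f).
\end{aligned}
$$

We conclude that $B = \phi_* A$. Thus, setting t = 1, $\psi(t) = e^{tB}$ can be rewritten
as

$$\phi(e^{A}) = e^{\phi_* A}, \quad \text{or} \quad \phi(\exp(A)) = \exp(\phi_* A),$$

which is what we wanted to prove.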
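As a concrete illustration of equation 7.56, one may take $\phi = \det : GL(n, \mathbb{R}) \to \mathbb{R}^*$, the multiplicative group of nonzero reals. The sketch below assumes the standard fact, not derived in this section, that the push-forward of the determinant at the identity is the trace, so that 7.56 reads $\det(e^A) = e^{\mathrm{Tr} A}$. The helper names and the sample matrix are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(a, terms=30):
    """Truncated power series for the matrix exponential."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[r + t for r, t in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    return result

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def trace(m):
    """Sum of the diagonal entries."""
    return sum(m[i][i] for i in range(len(m)))

A = [[0.2, 1.0], [-0.5, 0.3]]      # sample element of gl(2, R)

lhs = det2(expm(A))                # phi(exp(A)) with phi = det
rhs = math.exp(trace(A))           # exp(phi_* A) with phi_* = Tr (assumed)
gap = abs(lhs - rhs)
print(lhs, rhs, gap)
```

On the right-hand side the exponential map of the one-dimensional group $\mathbb{R}^*$ is the ordinary exponential function, which is why `math.exp` appears there.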


7.2.2 The Adjoint Map 


7.2.13 Definition Let G be a Lie group and let $g_0 \in G$. For each $g \in G$,
consider the conjugation automorphism $C_g = L_g R_g^{-1} : G \to G$ given by
$g_0 \mapsto g\, g_0\, g^{-1} = L_g R_{g^{-1}}(g_0)$. We define the adjoint map
$Ad_g : \mathfrak{g} \to \mathfrak{g}$ by the linear map

$$Ad_g = C_{g*} = (L_g R_g^{-1})_*. \tag{7.57}$$


7.2.14 Definition Denote by Aut(V) the automorphism group of all invertible
linear transformations of some vector space V over R (or C). If V has dimension
n, then Aut(V) is isomorphic to the matrix group GL(n, R). A homomorphism,

$$\phi : G \to GL(n, \mathbb{R})$$

from a Lie group to GL(n, R) is called a (real) representation of order n. If the
homomorphism is 1-1, the representation is called faithful. If W is a subspace of
V, and $\phi(g)v \in W$ for all $v \in W$ and $g \in G$, we say that W is an invariant
subspace. A representation with no non-trivial invariant subspace is called
irreducible. The same idea applies to Lie algebras. Since
$Ad_{gg'} = Ad_g \circ Ad_{g'}$, the adjoint map $Ad : G \to Aut(\mathfrak{g})$ is a
homomorphism, called the adjoint representation. The kernel is the center of the
group G.


By equation 7.56, if $X \in \mathfrak{g}$ we have $\exp(C_{g*}X) = C_g(\exp(X))$, that is,

$$e^{Ad_g X} = g\,(e^X)\,g^{-1}. \tag{7.58}$$

Consider the case G = GL(n, R). We evaluate the adjoint on a one-parameter
subgroup of G. Let $Y \in \mathfrak{g} = \mathfrak{gl}(n, \mathbb{R})$ and $t \mapsto e^{tY}$ be one
such one-parameter subgroup. Then for a matrix $g \in GL(n, \mathbb{R})$, the
conjugation map gives

$$C_g(e^{tY}) = g\, e^{tY} g^{-1}.$$




Taking the derivative with respect to t and evaluating at t = 0, we get

$$Ad_g(Y) = g\,Y g^{-1}. \tag{7.59}$$

Now, we evaluate the derivative of the adjoint map at t = 0, along another
one-parameter subgroup $t \mapsto e^{tX}$,

$$
\begin{aligned}
Ad_{e^{tX}} Y &= e^{tX}\, Y e^{-tX}, \\
&= (I + tX + \tfrac{1}{2}t^2X^2 + \dots)\, Y\, (I - tX + \tfrac{1}{2}t^2X^2 + \dots), \\
&= Y + t[X, Y] + O(t^2),
\end{aligned}
$$

$$\frac{d}{dt} Ad_{e^{tX}} Y \Big|_{t=0} = [X, Y].$$

We denote the quantity on the left hand side above by the notation $ad_X Y$,

$$ad_X Y = [X, Y]. \tag{7.60}$$


Equivalently, $ad_X \in End(\mathfrak{g})$ is an endomorphism given by the map
$Y \mapsto [X, Y]$. The reemergence of an operator yielding the Lie bracket is
indicative that there is a Lie derivative floating around. The details are easy
to clarify. The map $\varphi_t = g(t) = e^{tX}$ is a local diffeomorphism generated
by the flow of $X \in T_eG$. The Lie derivative of a vector field Y is given by

$$\mathcal{L}_X Y = \frac{d}{dt}(\varphi_t^* Y)\Big|_{t=0}.$$

The push-forward of the conjugation map $C_g = R_g^{-1} L_g$ acting on Y gives,

$$
\begin{aligned}
Ad_g Y &= (R_g^{-1})_* L_{g*} Y, \\
&= (R_g^{-1})_* Y, \quad \text{since Y is left invariant}, \\
Ad_{e^{tX}} Y &= (R_{e^{tX}}^{-1})_* Y, \\
\frac{d}{dt} Ad_{e^{tX}} Y \Big|_{t=0} &= \frac{d}{dt}(\varphi_t^* Y)\Big|_{t=0},
\end{aligned}
$$

which gives,

$$ad_X Y = \mathcal{L}_X Y.$$


We conclude that,

$$
\begin{aligned}
Ad_{\exp X} &= \exp(ad_X), \\
Ad_{e^X} &= e^{ad_X} = I + ad_X + \tfrac{1}{2!}(ad_X)^2 + \dots,
\end{aligned}
\tag{7.61}
$$

where the first term represents of course the identity element in the algebra.
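For a matrix group, formula 7.61 can be tested directly, since by 7.59 the left-hand side is the conjugation $e^X Y e^{-X}$ and the right-hand side is the series $Y + [X, Y] + \tfrac{1}{2!}[X, [X, Y]] + \dots$. The following sketch is an illustration under those assumptions; the helper names, the truncation at 30 terms, and the sample matrices are choices made for the example.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    """Scalar multiple c * a."""
    return [[c * x for x in row] for row in a]

def bracket(a, b):
    """Commutator [a, b] = ab - ba, that is, ad_a applied to b."""
    return add(matmul(a, b), scale(-1.0, matmul(b, a)))

def expm(a, terms=30):
    """Truncated power series for the matrix exponential."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, a))
        result = add(result, term)
    return result

def max_diff(a, b):
    """Largest entrywise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

X = [[0.1, 0.4], [-0.3, 0.2]]      # sample matrices (hypothetical data)
Y = [[1.0, 2.0], [3.0, 4.0]]

# Left side: Ad_{e^X} Y = e^X Y e^{-X}
lhs = matmul(matmul(expm(X), Y), expm(scale(-1.0, X)))

# Right side: sum over n of (ad_X)^n Y / n!
rhs = [[0.0, 0.0], [0.0, 0.0]]
term = Y                            # (ad_X)^0 Y / 0!
for n in range(30):
    rhs = add(rhs, term)
    term = scale(1.0 / (n + 1), bracket(X, term))

gap = max_diff(lhs, rhs)
print(gap)
```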


If the Lie algebra is finite dimensional, we can define the Killing form as the
form B given by,

$$B(X, Y) = Tr(ad_X \circ ad_Y) = Tr(ad_X\, ad_Y). \tag{7.62}$$

The Killing form is not a differential form, but rather, a symmetric bilinear
entity that plays the role of a metric in the Lie algebra. In the adjoint
representation, the Killing form is adjoint invariant, meaning,

$$B(ad_X Y, Z) + B(Y, ad_X Z) = 0.$$

In terms of $ad_X$, the Jacobi identity in definition 7.2 becomes,

$$[ad_X, ad_Y]Z = ad_{[X, Y]}Z, \tag{7.63}$$

which shows explicitly that ad is a Lie algebra homomorphism.


The adjoint map can be used to prove an interesting formulation of the CBH
formula first proved in 1899 by Poincaré [13]. Let,

$$\varphi(w) = \frac{w}{1 - e^{-w}} = \sum_{n=0}^{\infty} \frac{B_n^{+} w^n}{n!}$$

be the generating function for the Bernoulli numbers. Define,

$$g(z) = \varphi(\ln z) = \frac{z \ln z}{z - 1}.$$

Then, the Campbell-Baker-Hausdorff formula can be written in the form,

$$\ln(e^A e^B) = A + \int_0^1 g\left(e^{ad_A} e^{t\, ad_B}\right) B \, dt. \tag{7.64}$$


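The formula is complicated to use for explicit evaluation of the terms in the
expansion, but nonetheless is a neat result because it makes it manifestly clear
that the expansion depends only on the brackets. Following Hall [13], we
illustrate how to get the first three terms. Set z = v + 1 and expand g(v + 1)
in a Maclaurin series,

$$
\begin{aligned}
g(v + 1) &= \frac{v + 1}{v} \ln(v + 1), \\
&= \frac{v + 1}{v}\left(v - \tfrac{1}{2}v^2 + \tfrac{1}{3}v^3 - \dots\right), \\
&= (v + 1)\left(1 - \tfrac{1}{2}v + \tfrac{1}{3}v^2 - \dots\right), \\
&= 1 + \tfrac{1}{2}v - \tfrac{1}{6}v^2 + \dots
\end{aligned}
$$

Next, we set $e^{ad_A} e^{t\, ad_B} - I = v$ and evaluate g up to second order in
$ad_A$, $ad_B$. We ignore terms that contain $ad_B$ on the right, since $ad_B B = 0$.

$$
\begin{aligned}
v &= (I + ad_A + \tfrac{1}{2}(ad_A)^2 + \dots)(I + t\, ad_B + \tfrac{1}{2}t^2 (ad_B)^2 + \dots) - I, \\
&= ad_A + \tfrac{1}{2}(ad_A)^2 + t\, ad_B + \tfrac{1}{2}t^2 (ad_B)^2 + \dots, \\
v^2 &= (ad_A)^2 + t\, ad_B\, ad_A + \dots, \\
g(v + 1) &= I + \tfrac{1}{2}\left(ad_A + \tfrac{1}{2}(ad_A)^2\right) - \tfrac{1}{6}\left((ad_A)^2 + t\, ad_B\, ad_A\right) + \dots, \\
\int_0^1 g(v + 1)\, dt &= I + \tfrac{1}{2} ad_A + \tfrac{1}{12}(ad_A)^2 - \tfrac{1}{12}\, ad_B\, ad_A + \dots
\end{aligned}
$$

Hence,

$$\ln(e^A e^B) = A + B + \tfrac{1}{2}[A, B] + \tfrac{1}{12}[A, [A, B]] - \tfrac{1}{12}[B, [A, B]] + \dots$$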


7.2.3 The Maurer-Cartan Form 


Let G be a Lie group. A differential form $\theta$ in G is called left-invariant if

$$L_g^* \theta = \theta. \tag{7.65}$$

That is, if the form at a point $g_0$ is given by $\theta_{g_0}$, then,

$$L_g^* \theta_{g_0} = \theta_{g^{-1} g_0}.$$

The vector space $\mathfrak{g}^*$ of left-invariant one-forms is the dual space of the
Lie algebra $\mathfrak{g}$ of left-invariant vector fields. Recall that the Lie algebra
of left-invariant vector fields is isomorphic to the tangent space at the
identity $T_eG$. If X is a left-invariant vector field and $\theta$ is a left-invariant
one-form, then $\theta(X)$ is constant. If $\theta$ is left invariant, then for each
$g \in G$,

$$L_g^*(d\theta) = d(L_g^* \theta) = d\theta,$$

so $d\theta$ is also left-invariant. The canonical form or Maurer-Cartan form of G is
the form $\omega$ that assigns to a left-invariant vector field its value at
the identity, that is,

$$\omega(X_g) = L_{g^{-1}*} X_g, \quad \text{or equivalently,}$$

$$\omega(X) = X, \quad \text{for all } X \in T_eG.$$

The Maurer-Cartan form $\omega$ is left-invariant. On the other hand (the right), we
have,


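7.2.15 Theorem

$$R_g^* \omega = Ad_{g^{-1}} \omega. \tag{7.66}$$

Proof Let $X \in \mathfrak{g}$ generate a left invariant vector field in G via the flow
of the exponential map $X \mapsto e^{tX}$. Let $g_0 \in G$. By definition,
$\omega_{g_0}(X_{g_0}) = X$. Then

$$
\begin{aligned}
(R_g^* \omega)(X_{g_0}) &= \omega_{g_0 g}(R_{g*} X_{g_0}), \\
&= \frac{d}{dt} L_{(g_0 g)^{-1}}(g_0\, e^{tX} g)\Big|_{t=0}, \\
&= \frac{d}{dt} (g_0 g)^{-1}(g_0\, e^{tX} g)\Big|_{t=0}, \\
&= \frac{d}{dt}\, g^{-1} e^{tX} g\Big|_{t=0}, \\
&= Ad_{g^{-1}} X, \\
&= Ad_{g^{-1}}\, \omega_{g_0}(X_{g_0}).
\end{aligned}
$$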




Suppose $\{e_\alpha\}$ is a basis for the Lie algebra of left-invariant vector fields
$\mathfrak{g}$. The bracket of two vectors in the Lie algebra must be expressible as a
linear combination of the basis vectors, so there exist constants
$C^\gamma_{\;\alpha\beta}$ such that

$$[e_\alpha, e_\beta] = C^\gamma_{\;\alpha\beta}\, e_\gamma. \tag{7.67}$$

The quantities $C^\gamma_{\;\alpha\beta}$ are called the structure constants. Since
$[e_\alpha, e_\beta] = -[e_\beta, e_\alpha]$, the structure constants are antisymmetric
on the lower indices. The frame $\{e_\alpha\}$ is called a Maurer-Cartan frame. Let
$\{\omega^\alpha\}$ be the dual basis, so that $\omega^\alpha(e_\beta) = \delta^\alpha_\beta$.
By definition, applying the Maurer-Cartan form to $e_\alpha$ returns the value of
$e_\alpha$ at the identity. This gives an almost tautological expression for the
components of the Maurer-Cartan form in terms of the Maurer-Cartan coframe,

$$\omega = e_\alpha(e) \otimes \omega^\alpha.$$


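Applying the definition for the differential of a one-form 6.28,

$$d\omega(X, Y) = X(\omega(Y)) - Y(\omega(X)) - \omega([X, Y]),$$

we get,

$$
\begin{aligned}
d\omega^\alpha(e_\beta, e_\gamma) &= e_\beta(\omega^\alpha(e_\gamma)) - e_\gamma(\omega^\alpha(e_\beta)) - \omega^\alpha([e_\beta, e_\gamma]), \\
&= e_\beta(\delta^\alpha_\gamma) - e_\gamma(\delta^\alpha_\beta) - C^\sigma_{\;\beta\gamma}\, \omega^\alpha(e_\sigma), \\
&= -C^\alpha_{\;\beta\gamma}.
\end{aligned}
$$

Using the antisymmetry of the wedge product and the antisymmetry of the
structure constants in the lower indices, we can rewrite the last equation as,

$$d\omega^\alpha = -\tfrac{1}{2}\, C^\alpha_{\;\beta\gamma}\, \omega^\beta \wedge \omega^\gamma. \tag{7.68}$$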


This equation of structure is called the Maurer-Cartan equation. Let X, Y
be left-invariant. Since $\omega(X)$ and $\omega(Y)$ are constant, we have
$X(\omega(Y)) = Y(\omega(X)) = 0$, so using the definition 6.28 for the differential
of a one-form, we can also write the Maurer-Cartan equation as

$$d\omega(X, Y) = -\omega([X, Y]). \tag{7.69}$$

There is an annoying factor of 1/2 which makes the notation inconsistent in the
literature. Some authors include such a factor in the equation of structure 7.69,
but typically those authors also include a 1/2 in their definition of the
differential of a one-form $d\omega(X, Y)$. Other authors restrict the sum in
equation 7.68 to values i < j, so the factor 1/2 does not appear there. Yet some
others avoid the bracket notation altogether, or they invent a new hybrid
wedge/bracket $[\omega, \omega]$ of forms that may account for the 1/2. In this latter
case, the 2-form represented by the wedge/bracket² is usually interpreted as a
section of $\Lambda^2 \otimes \mathfrak{g} \otimes \mathfrak{g}$.


²If $\alpha, \beta \in \Omega^1 \otimes \mathfrak{g}$ are Lie algebra valued one-forms, the usual
definition of the bracket is
$[\alpha, \beta](X, Y) = [\alpha(X), \beta(Y)] - [\alpha(Y), \beta(X)]$. Thus
$[\alpha, \alpha](X, Y) = 2[\alpha(X), \alpha(Y)]$.




If G is a matrix group, real or complex, the Maurer-Cartan form 3.41 can be
written as³,

$$\omega = A^{-1} dA, \tag{7.70}$$

where $g = A \in G$ plays the role of the attitude matrix introduced in section 3.4.
The form $\omega = g^{-1} dg$ is clearly left-invariant, because if $g_0$ is another
constant matrix, then

$$(g_0 g)^{-1} d(g_0 g) = g^{-1} dg.$$
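As a small worked illustration of formula 7.70 (an example added here, with the sign convention for the rotation matrix chosen as an assumption), take G = SO(2, R) parametrized by the angle $\theta$:

$$
A(\theta) = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix},
\qquad
A^{-1} = A^T,
\qquad
dA = \begin{bmatrix} -\sin\theta & \cos\theta \\ -\cos\theta & -\sin\theta \end{bmatrix} d\theta.
$$

Multiplying out,

$$
\omega = A^{-1} dA
= \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}
  \begin{bmatrix} -\sin\theta & \cos\theta \\ -\cos\theta & -\sin\theta \end{bmatrix} d\theta
= \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} d\theta.
$$

The coefficient matrix is constant and antisymmetric, so $\omega$ takes values in the Lie algebra, and $d\omega = 0$, as one expects from the Maurer-Cartan equation for an Abelian group whose structure constants all vanish.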


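We can express the components of the Killing form in terms of the structure
constants. First, we compute

$$
\begin{aligned}
(ad_{e_\alpha}\, ad_{e_\beta})\, e_\gamma &= ad_{e_\alpha}([e_\beta, e_\gamma]), \\
&= [e_\alpha, [e_\beta, e_\gamma]], \\
&= [e_\alpha, C^\sigma_{\;\beta\gamma}\, e_\sigma], \\
&= C^\rho_{\;\alpha\sigma}\, C^\sigma_{\;\beta\gamma}\, e_\rho.
\end{aligned}
$$

Taking the trace means we perform a contraction with the dual form $\omega^\gamma$, which
is the same as setting $\gamma = \rho$ on the coefficients on the right. We get

$$B_{\alpha\beta} = C^\rho_{\;\alpha\sigma}\, C^\sigma_{\;\beta\rho}. \tag{7.71}$$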


Notice that the components of the Killing form result from the contraction of
the antisymmetric indices of the structure constants; this yields the simplest
symmetric tensor that can be constructed from the structure constants. Cartan
used the Killing form to characterize an important attribute of Lie algebras
called semisimple. A non-Abelian Lie algebra is simple if it has no non-trivial
proper ideals. A semisimple Lie algebra can be decomposed as the direct sum
of simple Lie algebras. The Cartan criterion states that the Lie algebra
is semisimple if and only if the Killing form is non-degenerate. Of course, the
structure constants depend on the choice of the basis. But, since the form is
symmetric, it can be diagonalized by an orthogonal matrix, so it can be classified
by the eigenvalues. If the Lie group is compact and semisimple, the Killing form
as defined in 7.62 is negative definite (so that $-B$ is positive definite), and in
diagonal form, all the entries are negative. The most salient achievement of
E. Cartan was to provide a complete classification of semisimple Lie algebras.
Included in this classification are all the Lie algebras associated with the
special Lie groups mentioned above in this chapter.

It is also easy to express the adjoint representation in terms of the structure
constants. All that is really needed is to define matrices $T_\alpha$ whose
components are,

$$(T_\alpha)^\gamma_{\;\beta} = C^\gamma_{\;\alpha\beta}.$$

We then get yet another manifestation of the Jacobi identity 7.2 in the form,

$$[T_\alpha, T_\beta] = C^\gamma_{\;\alpha\beta}\, T_\gamma. \tag{7.72}$$

This is the component version of equation 7.63, and it shows that the structure
constants themselves generate the adjoint representation.
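The component formulas 7.71 and 7.72 can be tried out on the rotation algebra so(3). The sketch below is an illustration only: it assumes a basis in which the structure constants are given by the Levi-Civita symbol, $[e_\alpha, e_\beta] = \epsilon_{\alpha\beta\gamma}\, e_\gamma$, builds the matrices $T_\alpha$ with row index $\gamma$ and column index $\beta$, and then evaluates both the commutation relations and the Killing form. All names in the code are hypothetical.

```python
def levi_civita(i, j, k):
    """Totally antisymmetric symbol on the indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

n = 3

# C[g][a][b] stands for C^g_{ab}, with [e_a, e_b] = C^g_{ab} e_g.
# Assumption: so(3) in a basis where C^g_{ab} is the Levi-Civita symbol.
C = [[[levi_civita(a, b, g) for b in range(n)] for a in range(n)]
     for g in range(n)]

# (T_a)^g_b = C^g_{ab}: row index g, column index b.
T = [[[C[g][a][b] for b in range(n)] for g in range(n)] for a in range(n)]

def matmul(x, y):
    """Product of two square matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(x, y):
    """Matrix commutator xy - yx."""
    xy, yx = matmul(x, y), matmul(y, x)
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(xy, yx)]

def combination(a, b):
    """Right-hand side of 7.72: the sum over g of C^g_{ab} T_g."""
    return [[sum(C[g][a][b] * T[g][i][j] for g in range(n))
             for j in range(n)] for i in range(n)]

# Equation 7.72: [T_a, T_b] = C^g_{ab} T_g for every pair (a, b).
relations_hold = all(commutator(T[a], T[b]) == combination(a, b)
                     for a in range(n) for b in range(n))

# Equation 7.71: B_ab = C^r_{as} C^s_{br}, the trace of T_a T_b.
killing = [[sum(C[r][a][s] * C[s][b][r] for r in range(n) for s in range(n))
            for b in range(n)] for a in range(n)]

print(relations_hold)
print(killing)
```

With integer structure constants all comparisons are exact. In this basis the Killing matrix comes out diagonal, which makes its non-degeneracy, and hence the semisimplicity of so(3) by the Cartan criterion, immediate.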


³For a matrix group, $\omega(X_g) = L_{g^{-1}*}(X_g) = dL_{g^{-1}}(X_g)$. So, the
logarithmic differential maps the vector $X_g$ to its value at the identity e.




7.2.4 Cartan Subalgebra 


To get at the physics applications of Lie algebras we need to dip our toes
into representation theory. Actually, what we should say is that it is the physics
that led to the development of representation theory by giants like Cartan and
Weyl.


7.2.16 Definition A Cartan subalgebra $\mathfrak{h} \subset \mathfrak{g}$ is a nilpotent
subalgebra that is its own normalizer. The dimension of the Cartan subalgebra is
called the rank of $\mathfrak{g}$.

Every pair of elements of $\mathfrak{h}$ commute, and every element of $\mathfrak{g}$ that
commutes with all elements of $\mathfrak{h}$ is in $\mathfrak{h}$. In this sense, a Cartan
subalgebra consists of a maximal set of commuting generators of the Lie algebra.
The idea is to simultaneously diagonalize the basis of $\mathfrak{h}$. The eigenvalues
are used to label the states of the system.

Using the Killing form, one finds an orthonormal basis $\{h_i\}$ for $\mathfrak{h}$ and
extends to a basis of $\mathfrak{g}$

$$\{h_1, h_2, \dots, h_k,\; g_1, g_{-1}, g_2, g_{-2}, \dots, g_{\frac{n-k}{2}}, g_{-\frac{n-k}{2}}\}$$

with the following properties:

1. $[h_i, h_j] = 0$.

2. $[h, g] = \lambda(h)\, g$ for all $h \in \mathfrak{h}$ and $0 \neq g \in \mathfrak{g}$. Or, in
   terms of the basis, $[h_i, g_j] = \lambda_i^{(j)} g_j$.

3. $[g_j, g_{-j}] \in \mathfrak{h}$.

The first property is a re-statement that each pair of basis vectors in $\mathfrak{h}$
commute. The second property is a kind of generalized eigenvector equation for
$ad_h$. For each $g_j$ we associate a position vector
$r^{(j)} = (\lambda_1^{(j)}, \lambda_2^{(j)}, \dots, \lambda_k^{(j)})$. These are
called the roots and the set of all roots is called the root space. The plot of the
root vectors in $\mathbb{R}^k$ is a set of arrows that exhibits certain reflection
symmetries; the set is called the root diagram. Root spaces lead to Cartan's
classification of semisimple Lie algebras. The classification can also be
visualized by a scheme called Dynkin diagrams. The basis elements $g_j$ and
$g_{-j}$ are called the raising and lowering operators. We will show in the next
chapter how this abstract machinery leads to real concrete results in physics.
Lie symmetries of physical systems are interconnected with the deep subject of
representation theory.


7.3 Transformation Groups 


The importance of this chapter for the purpose of applications to physics
is that in many physical models, Lie groups are manifested as transformation
groups that act on the system. In the simplest case we have linear
transformations in $\mathbb{R}^n$ which, in a particular basis, can be represented by
matrix multiplication of an element of the general linear group GL(n, R) with a
vector. The group of rotations in $\mathbb{R}^3$ constitutes a symmetry group in the
dynamical system of rigid body motion, and the Lorentz group is the essential
symmetry group of space-time. The Lie algebra of a Lie group is basically a first
order symmetry approximation. If one thinks of a Lie group such as the rotation
group as a symmetry group acting on a manifold, then an element of the Lie algebra
represents an infinitesimal transformation near the identity of the group. Basis
vectors of the Lie algebra of the corresponding Lie group transformation are
then interpreted as generators of infinitesimal transformations. The exponential
map provides a bridge between infinitesimal transformations, represented by
elements of the Lie algebra, and full elements of the Lie group. The Lie algebra
is determined by the structure constants and the Maurer-Cartan equations.


7.3.1 Definition Let M be an n-dimensional manifold and G a Lie group.
G is called a Lie transformation group on (the right of) M, if there exists a
smooth map $\mu : M \times G \to M$,

$$\mu(p, g) = p \cdot g = R_g(p), \quad \text{for } (p, g) \in M \times G,$$

such that for each $p \in M$,

1) $p \cdot e = p$,

2) $(p \cdot g_1) \cdot g_2 = p \cdot (g_1 \cdot g_2)$, for every $g_1, g_2 \in G$.

Of course, a one-parameter group of diffeomorphisms in the sense of definition
7.1.1, is a special case of a Lie transformation group, with G = R. Another
example would be the action of the rotation group SO(3, R) on a sphere $S^2$.
We are often interested in linear representations of the group acting on some
vector space, in which the action respects the linear structure. An example of
this would be the adjoint representation, arising from the action of a Lie group
on itself.


7.3.2 Definition Let G be a transformation group on M with identity e,
and let $p \in M$ be any point in the manifold. We say the transformation is,

1. Effective, if the kernel $K = \{g \in G : p \cdot g = p \text{ for all } p \in M\} = \{e\}$.
   In other words, if $g \neq e$, there exists a point p, such that $p \cdot g \neq p$.
   The kernel of the action is a normal subgroup. If the kernel is not trivial,
   the action of G on M is not effective, but the action of G/K is.

2. Free, if $g \neq e$ implies $p \cdot g \neq p$ for every p. In other words, if
   $g \neq e$, then there is no fixed point for g. If there is a fixed point p, we
   define the isotropy subgroup of p as $Iso(p) = \{g \in G : p \cdot g = p\}$.

3. Transitive, if for any $p \neq q$ there exists a g such that $p \cdot g = q$. The
   set of all q such that $p \cdot g = q$ for some g is called the orbit of the
   point p, and it is denoted by Gp. The map $g \mapsto p \cdot g$ defines a
   diffeomorphism $G/Iso(p) \cong Gp$.


Unless otherwise stated, we assume that when we say that a Lie group
acts on a manifold, the action is on the right, and the action is transitive and
effective. Let X be an element of the Lie algebra $\mathfrak{g}$ of a Lie group G acting
on M, and let $p \in M$. The one-parameter subgroup given by the exponential map

$$t \mapsto e^{tX}$$

generates a curve on M given by

$$\varphi_t(p) = p\, e^{tX} = R_{e^{tX}}(p), \tag{7.73}$$

with $\varphi_0(p) = p$ and with tangent vector $X^* = \sigma(X)$ at p given by,

$$\sigma(X)|_p = \frac{d}{dt}\left(p\, e^{tX}\right)\Big|_{t=0}. \tag{7.74}$$


If U is a neighborhood of p, the map $\varphi_t(p)$ above constitutes a local
one-parameter group of transformations of M associated with the vector field
$X^* = \sigma(X)$, called the fundamental vector field. It is not really possible to
draw a good picture of a fundamental vector field, since for starters, all but the
most trivial principal fiber bundles either live in higher dimensions, or have
complicated topologies. Nevertheless, figure 7.7 may be of some help in
visualizing these vector fields.

Fig. 7.7: Fundamental Vector


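7.3.3 Theorem

$$(R_g)_*(\sigma(X)) = \sigma(Ad_{g^{-1}} X). \tag{7.75}$$

Proof Let $\varphi_t$ be the one-parameter group of diffeomorphisms associated with
$\sigma(X)$ at p and $\psi_t$ be the one-parameter group of diffeomorphisms associated
with $(R_g)_*(\sigma(X))$ at pg. The map $R_g : M \to M$ is a local diffeomorphism, so
as in equation 7.9 we have the commuting diagram,

$$
\begin{array}{ccc}
M & \xrightarrow{\;\varphi_t\;} & M \\
\downarrow {\scriptstyle R_g} & & \downarrow {\scriptstyle R_g} \\
M & \xrightarrow{\;\psi_t\;} & M.
\end{array}
$$

Thus, we get,

$$
\begin{aligned}
\psi_t &= R_g \circ \varphi_t \circ R_{g^{-1}}, \\
&= R_g \circ R_{e^{tX}} \circ R_{g^{-1}}, \\
&= R_{g^{-1} e^{tX} g}.
\end{aligned}
$$

The one-parameter group of diffeomorphisms $\{g^{-1} e^{tX} g\}$ is generated by
$Ad_{g^{-1}} X$, so the vector field associated with $\psi_t$ is $\sigma(Ad_{g^{-1}} X)$. We
summarize this in the following diagram,

$$
\begin{array}{ccc}
\mathfrak{X}(M) & \xrightarrow{\;R_{g*}\;} & \mathfrak{X}(M) \\
\uparrow {\scriptstyle \sigma} & & \uparrow {\scriptstyle \sigma} \\
\mathfrak{g} & \xrightarrow{\;Ad_{g^{-1}}\;} & \mathfrak{g}
\end{array}
$$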

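It might be instructive to present a second proof in the style of the proof of
equation 7.66,

$$
\begin{aligned}
(R_{g*})\, \sigma(X) &= \frac{d}{dt} R_g(p\, e^{tX})\Big|_{t=0}, \\
&= \frac{d}{dt}\, p\, e^{tX} g\Big|_{t=0}, \\
&= \frac{d}{dt}\, p\,(g g^{-1})\, e^{tX} g\Big|_{t=0}, \\
&= \frac{d}{dt}\, (pg)\, g^{-1} e^{tX} g\Big|_{t=0}, \\
&= \frac{d}{dt}\, (pg)\, e^{t\, Ad_{g^{-1}} X}\Big|_{t=0}, \\
&= \sigma_{pg}(Ad_{g^{-1}} X).
\end{aligned}
$$

Here, we have used equation 7.58 in the next to last step. This formula is
consistent with the formula for the pull-back of the Maurer-Cartan form 7.66
by the following computation,

$$
\begin{aligned}
R_g^* \omega(\sigma(X)) &= \omega(R_{g*}\, \sigma(X)), \\
&= \omega(\sigma(Ad_{g^{-1}} X)), \\
&= Ad_{g^{-1}}\, \omega(\sigma(X)).
\end{aligned}
$$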


The map

$$\sigma : \mathfrak{g} \to \mathfrak{X}(M)$$

given by

$$X \mapsto X^* = \sigma(X)$$

can also be viewed in an alternative way by considering the map

$$\sigma_p : G \to M, \qquad g \mapsto \sigma_p(g) = pg.$$

Then

$$\sigma(X)(p) = (\sigma_p)_*(X)_e.$$

This small variation of the definition of a fundamental vector is helpful in
establishing the following,


7.3.4 Theorem The map $\sigma$ is a Lie algebra homomorphism, that is,

$$\sigma([X, Y]) = [\sigma(X), \sigma(Y)]. \tag{7.76}$$




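Proof Aside from a small change in notation, the proof here is the same as
in Spivak [34], and in Kobayashi-Nomizu [18]. Let $\varphi_t(p) = p\, e^{tX} = R_{e^{tX}}\, p$
be the one-parameter group of diffeomorphisms associated with $X \in \mathfrak{g}$, and let
$Y \in \mathfrak{g}$. We extend X and Y to left-invariant vector fields in G. By theorem
7.1.8, we have

$$
\begin{aligned}
[X, Y] &= \mathcal{L}_X Y, \\
&= \lim_{t \to 0} \frac{Y_e - (\varphi_{t*} Y)_e}{t}, \\
&= \lim_{t \to 0} \frac{Y_e - (R_{e^{tX}*} Y)_e}{t}, \\
&= \lim_{t \to 0} \frac{Y_e - (Ad_{e^{-tX}} Y)_e}{t}, \quad \text{as in the proof of theorem 7.1.8.}
\end{aligned}
$$

Denote by $R_g : M \to M$ the map $R_g(p) = pg$, then

$$(R_{e^{tX}} \circ \sigma_{p e^{-tX}})(g) = p\,(e^{-tX} g\, e^{tX}).$$

Thus, once again by theorem 7.1.8, the Lie bracket of the fundamental vectors
gives

$$
\begin{aligned}
[\sigma(X), \sigma(Y)]_p &= \lim_{t \to 0} \frac{\sigma(Y)_p - [R_{e^{tX}*}\, \sigma(Y)]_p}{t}, \\
&= \lim_{t \to 0} \frac{\sigma(Y)_p - \sigma(Ad_{e^{-tX}} Y)_p}{t}, \quad \text{as in theorem 7.3.3,} \\
&= \lim_{t \to 0} \frac{\sigma_{p*} Y_e - \sigma_{p*}(Ad_{e^{-tX}} Y)_e}{t}, \\
&= \sigma_{p*} \lim_{t \to 0} \frac{Y_e - (Ad_{e^{-tX}} Y)_e}{t}, \\
&= \sigma([X, Y])_p.
\end{aligned}
$$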

For the time being, the results in theorems 7.3.3 and 7.3.4 might appear as a
pure formality, but as we will see later, they are instrumental in the treatment
of connections on principal fiber bundles. The first of these two formulas tells
us how to push forward fundamental vectors along the orbit of right-translations.
The second theorem states that the fundamental vectors constitute a Lie algebra
that is completely determined by the Lie algebra of the group. The two results
are used in interpreting the meaning of connections on principal fiber bundles,
as later defined in 9.3.2.


Chapter 8 


Classical Groups in Physics 


In this section we present a pedestrian view of some of the common Lie 
algebras and classical Lie groups that appear in mathematical physics. 


8.1 Orthogonal Groups 


8.1.1 Rotations in R? 


Let $z = x + iy$ be a complex number, and consider the map introduced in
section 5.2.2,

$$(x + iy) \mapsto \begin{bmatrix} x & y \\ -y & x \end{bmatrix}.$$

The map is clearly a vector space isomorphism between the complex numbers
and a subspace of the set of 2 × 2 matrices. The map can be written as

$$(x + iy) \mapsto xI + yJ,$$

where I is the identity matrix and J is the symplectic matrix 5.50, with
$J^2 = -I$. Define

$$U(1) = \{z \in \mathbb{C} : |z| = 1\}$$

to be the group of unimodular complex numbers. If $z \in U(1)$, then we can
write z in the form $z = e^{i\theta}$. The map $\theta \mapsto e^{i\theta}$ is not 1-1 because
replacing $\theta$ by $\theta + 2\pi$ gives the same number. The kernel of the map is the
set $2\pi\mathbb{Z}$, that is, the integer multiples of $2\pi$. U(1) acts on $\mathbb{C}$ by
multiplication, which results in a rotation by $\theta$. The action passes to the
circle $S^1 \cong \mathbb{R}/(2\pi\mathbb{Z})$. By Euler's formula,
$e^{i\theta} = \cos\theta + i\sin\theta$, so the map restricts to a map from U(1) to the
special orthogonal group SO(2, R) consisting of 2 × 2 rotation matrices,

$$e^{i\theta} \mapsto R_\theta = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}.$$


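It is an elementary exercise to verify that
$e^{i\theta_1}\cdot e^{i\theta_2} \mapsto R_{\theta_1}\cdot R_{\theta_2}$, that is,

\[ e^{i(\theta_1+\theta_2)} \mapsto R_{\theta_1}R_{\theta_2} = \begin{bmatrix} \cos(\theta_1+\theta_2) & \sin(\theta_1+\theta_2) \\ -\sin(\theta_1+\theta_2) & \cos(\theta_1+\theta_2) \end{bmatrix}. \]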


The exercise amounts to multiplying the rotation matrices and recognizing the
summation formulæ for sine and cosine. Thus, the map is a smooth Lie group
homomorphism. Since the map is also a diffeomorphism, we have a Lie group
isomorphism $U(1) \cong SO(2,\mathbb{R})$. For reasons that will become apparent
in comparing later with the discussion of rotations in $\mathbb{R}^3$ by
quaternions, we show the expression for the rotation matrix $R_\theta$ as the
product of two consecutive rotations by $\theta/2$, by means of the double
angle formulas

\begin{align}
R_\theta &= \begin{bmatrix} \cos^2\frac{\theta}{2} - \sin^2\frac{\theta}{2} & 2\sin\frac{\theta}{2}\cos\frac{\theta}{2} \\ -2\sin\frac{\theta}{2}\cos\frac{\theta}{2} & \cos^2\frac{\theta}{2} - \sin^2\frac{\theta}{2} \end{bmatrix}, \notag \\
&= \begin{bmatrix} q_0^2 - q_1^2 & 2q_0q_1 \\ -2q_0q_1 & q_0^2 - q_1^2 \end{bmatrix}, \tag{8.1}
\end{align}

where $q_0 = \cos\frac{\theta}{2}$ and $q_1 = \sin\frac{\theta}{2}$.
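The homomorphism property and the half-angle form 8.1 can be checked numerically. The following is a minimal sketch, assuming the map sends $x + iy$ to $xI + yJ$ with $J = [[0, 1], [-1, 0]]$ as in the text; the helper names `alpha`, `matmul` and `close` are illustrative only and are not part of the book.

```python
import cmath

def alpha(z):
    """Map x + iy to the real 2x2 matrix xI + yJ, with J = [[0, 1], [-1, 0]]."""
    return [[z.real, z.imag], [-z.imag, z.real]]

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

theta1, theta2 = 0.7, 1.9
z1, z2 = cmath.exp(1j * theta1), cmath.exp(1j * theta2)

# Homomorphism: the image of a product is the product of the images.
homomorphism_ok = close(alpha(z1 * z2), matmul(alpha(z1), alpha(z2)))

# Half-angle form (8.1): R_theta is the square of the rotation by theta/2.
half = alpha(cmath.exp(1j * theta1 / 2))
half_angle_ok = close(alpha(z1), matmul(half, half))
```

Both flags should come out true for any choice of angles, since the checks only restate the addition formulas for sine and cosine.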
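Now, consider the exponential map $\phi : \mathbb{R} \to SO(2,\mathbb{R})$ given by,

\[ \phi(t) = e^{tA} = \begin{bmatrix} \cos t\theta & \sin t\theta \\ -\sin t\theta & \cos t\theta \end{bmatrix}. \tag{8.2} \]

A matrix $e^{tA}$ is orthogonal if $(e^{tA})^T = (e^{tA})^{-1} = e^{-tA}$, so this
implies that $A^T = -A$. Per our previous discussion, the matrix $A$ is a
representative of the Lie algebra, so the Lie algebra of the orthogonal group
consists of antisymmetric matrices. For the special orthogonal group, with
matrices satisfying $\det e^{A} = 1$, the formula 5.45,

\[ \det e^{A} = e^{\mathrm{Tr}\,A}, \]

implies that elements $A$ of the Lie algebra also have zero trace. We have
$A = \phi'(0)$, with $\phi(0) = I$, so,

\begin{align}
A &= \frac{d}{dt}\begin{bmatrix} \cos t\theta & \sin t\theta \\ -\sin t\theta & \cos t\theta \end{bmatrix}\Bigg|_{t=0}, \tag{8.3} \\
&= \begin{bmatrix} 0 & \theta \\ -\theta & 0 \end{bmatrix} = \theta\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}. \tag{8.4}
\end{align}

This means that the matrix $J$ is a basis for the Lie algebra
$\mathfrak{so}(2,\mathbb{R})$. This example is as simple as it gets, but there
are some good lessons to be learned. As was discussed earlier, the
significance of the Lie algebra element $A$ and the basis $J$ is that they
constitute the generators of an infinitesimal rotation near the identity. This
becomes clear if one looks at the rotation matrix with $\theta$ small. Then,
$\cos\theta \approx 1$ and $\sin\theta \approx \theta$. To first order, the
rotation matrix is given by $R_\theta \approx I + \theta J = I + A$. We verify
that the exponential map gives the group element from the infinitesimal
generator. The computation is almost identical to the proof of Euler's formula
by Maclaurin series. The key is that $J^2 = -I$, $J^3 = -J$, etc., so that

\begin{align*}
e^{A} &= I + A + \tfrac{1}{2!}A^2 + \tfrac{1}{3!}A^3 + \tfrac{1}{4!}A^4 + \tfrac{1}{5!}A^5 + \cdots \\
&= I + \theta J + \tfrac{1}{2!}(\theta J)^2 + \tfrac{1}{3!}(\theta J)^3 + \tfrac{1}{4!}(\theta J)^4 + \tfrac{1}{5!}(\theta J)^5 + \cdots \\
&= \left(1 - \tfrac{1}{2!}\theta^2 + \tfrac{1}{4!}\theta^4 - \cdots\right) I + \left(\theta - \tfrac{1}{3!}\theta^3 + \tfrac{1}{5!}\theta^5 - \cdots\right) J.
\end{align*}

Hence

\[ e^{\theta J} = (\cos\theta)\, I + (\sin\theta)\, J, \tag{8.5} \]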


which is the original rotation matrix. The irreducible representations of
$U(1)$ are the trivial representation and the 2 × 2 matrix representations
given by the maps,

\[ \rho_n(e^{i\theta}) = \begin{bmatrix} \cos 2n\theta & \sin 2n\theta \\ -\sin 2n\theta & \cos 2n\theta \end{bmatrix}, \quad n \in \mathbb{Z}^+, \]

so basically, the irreducible representations are the homomorphisms of the
group into itself.
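The series computation leading to equation 8.5 can be reproduced numerically by summing the exponential series for $A = \theta J$ directly. This is a small sketch under the assumption that thirty terms of the series are more than enough for a moderate angle; the function names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_series(a, terms=30):
    """Partial sum I + A + A^2/2! + ... of the matrix exponential series."""
    n = len(a)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        # After this step, term holds A^k / k!.
        term = [[x / k for x in row] for row in matmul(term, a)]
        total = [[t + s for t, s in zip(rt, rs)] for rt, rs in zip(total, term)]
    return total

theta = 0.8
A = [[0.0, theta], [-theta, 0.0]]            # A = theta * J
R = exp_series(A)
expected = [[math.cos(theta), math.sin(theta)],
            [-math.sin(theta), math.cos(theta)]]
max_error = max(abs(R[i][j] - expected[i][j])
                for i in range(2) for j in range(2))
```

The value of `max_error` should be at the level of floating point round-off, in agreement with $e^{\theta J} = (\cos\theta) I + (\sin\theta) J$.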


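8.1.2 Rotations in R³


The Lie group $SO(3,\mathbb{R})$ consists of 3 × 3 orthogonal matrices with
determinant equal to 1. The Lie algebra $\mathfrak{so}(3,\mathbb{R})$ is the
set of 3 × 3 antisymmetric matrices with zero trace. The zero trace condition
is superfluous since the diagonal elements of an antisymmetric matrix are
zero. In consideration of the case above for $\mathfrak{so}(2,\mathbb{R})$, we
choose as basis for $\mathfrak{so}(3,\mathbb{R})$ the matrices

\[
a_x = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{bmatrix}, \quad
a_y = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{bmatrix}, \quad
a_z = \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.
\]

The exponentials $R_x = e^{\theta_1 a_x}$, $R_y = e^{\theta_2 a_y}$,
$R_z = e^{\theta_3 a_z}$ represent rotations about the x, y and z axes
respectively. In explicit form, these matrices are,

\[
R_x(\theta_1) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta_1 & \sin\theta_1 \\ 0 & -\sin\theta_1 & \cos\theta_1 \end{bmatrix}, \quad
R_y(\theta_2) = \begin{bmatrix} \cos\theta_2 & 0 & \sin\theta_2 \\ 0 & 1 & 0 \\ -\sin\theta_2 & 0 & \cos\theta_2 \end{bmatrix}, \quad
R_z(\theta_3) = \begin{bmatrix} \cos\theta_3 & \sin\theta_3 & 0 \\ -\sin\theta_3 & \cos\theta_3 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
\]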
Any rotation in $\mathbb{R}^3$ can be obtained by a composition of rotations
about the x, y and z axes. However, the standard in physics is to utilize the
Euler angles $\{\phi, \theta, \psi\}$ introduced by Euler to study the motion
of rigid bodies. The general Euler angle rotation is obtained by the
composition of three rotations carried out as follows (see figure 8.1).


1. Perform a rotation $R_z(\phi)$ by an angle $\phi$ around the z-axis. We
label the new axes as $\{\xi, \eta, z\}$.


2. Follow by a rotation $R_\xi(\theta)$ by an angle $\theta$ around the new
x-axis, which in step (1) we labelled $\xi$. We label the new axes as
$\{\xi, \eta', z'\}$.


3. Finish with a rotation $R_{z'}(\psi)$ by an angle $\psi$ around the new
z-axis, which in step (2) we labelled $z'$. The final axes are labelled
$\{x', y', z'\}$.


Fig. 8.1: Euler Angles


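The rotation matrices are

\[
R_z(\phi) = \begin{bmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad
R_\xi(\theta) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{bmatrix}, \quad
R_{z'}(\psi) = \begin{bmatrix} \cos\psi & \sin\psi & 0 \\ -\sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{bmatrix}. \tag{8.6}
\]

A straightforward matrix multiplication yields the full rotation
$R = R_{z'}(\psi)\cdot R_\xi(\theta)\cdot R_z(\phi)$, in which the rotation
performed first is the rightmost factor,

\[
R = \begin{bmatrix}
\cos\psi\cos\phi - \cos\theta\sin\phi\sin\psi & \cos\psi\sin\phi + \cos\theta\cos\phi\sin\psi & \sin\psi\sin\theta \\
-\sin\psi\cos\phi - \cos\theta\sin\phi\cos\psi & -\sin\psi\sin\phi + \cos\theta\cos\phi\cos\psi & \cos\psi\sin\theta \\
\sin\theta\sin\phi & -\sin\theta\cos\phi & \cos\theta
\end{bmatrix}. \tag{8.7}
\]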

Since $R$ is the product of orthogonal matrices, the matrix is also orthogonal
and $R^{-1} = R^T$. If we consider the unit 2-sphere
$S^2 = \{(x,y,z) : x^2 + y^2 + z^2 = 1\}$, the rotation gives a map
$R : S^2 \to S^2$. Any rotation of $S^2$ can be viewed as a composition $R$ of
the three Euler angle rotations, or as a single rotation around an axis
pointing towards the image of the north pole along the axis $z'$. In
principle, we should be able to prescribe a direction and an angle as the data
to find a matrix representing a rotation by the given angle around the given
direction. Finding this data in terms of the Euler angles requires a bit of work.
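The closed form 8.7 can be compared against a direct product of the three matrices in 8.6. The sketch below assumes the matrix product is taken with the first rotation $R_z(\phi)$ as the rightmost factor, which is the ordering that reproduces 8.7; all function names are illustrative.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def euler_closed_form(phi, theta, psi):
    """Entries of equation 8.7."""
    cf, sf = math.cos(phi), math.sin(phi)
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(psi), math.sin(psi)
    return [[cp * cf - ct * sf * sp, cp * sf + ct * cf * sp, sp * st],
            [-sp * cf - ct * sf * cp, -sp * sf + ct * cf * cp, cp * st],
            [st * sf, -st * cf, ct]]

phi, theta, psi = 0.3, 1.1, -0.6
R = matmul(rot_z(psi), matmul(rot_x(theta), rot_z(phi)))
closed = euler_closed_form(phi, theta, psi)
max_error = max(abs(R[i][j] - closed[i][j]) for i in range(3) for j in range(3))

# Orthogonality: R times its transpose should be the identity.
Rt = [[R[j][i] for j in range(3)] for i in range(3)]
RRt = matmul(R, Rt)
orthogonality_error = max(abs(RRt[i][j] - (1.0 if i == j else 0.0))
                          for i in range(3) for j in range(3))
```

Both `max_error` and `orthogonality_error` should be at round-off level, which confirms the entries of 8.7 and the statement $R^{-1} = R^T$.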


8.1.3 SU(2) 


In this subsection, we develop a representation of rotations in terms of 2 × 2
complex matrices. As discussed in section 1.4, orthogonal transformations in
$\mathbb{R}^n$ are isometries, so they can be described as the group of
transformations that preserve length. In $\mathbb{R}^3$ the length is given by
$g(X,X) = x^2 + y^2 + z^2$ under the standard metric. Getting a little ahead of
ourselves, let's consider the metric $\eta = \mathrm{diag}(+\,-\,-\,-)$ for
Minkowski's space as in 2.35. Let $x^\mu = (t,x,y,z)$ be the components of a
vector in $M_{1,3}$. Consider the map from $M_{1,3}$ to a 2 × 2 Hermitian
matrix given by,

\[ x^\mu = (t,x,y,z) \mapsto X^{A\bar{B}} = \begin{bmatrix} t+z & x-iy \\ x+iy & t-z \end{bmatrix}. \tag{8.8} \]




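The index notation for the matrix $X = (X^{A\bar{B}})$ is meant to elucidate
the property of Hermitian matrices, for which the complex conjugate
$\overline{X^{A\bar{B}}}$ equals the transpose $X^{B\bar{A}}$. The bar index
notation was used in early work on spinors by Veblen and Taub. Bar indices were
later changed to prime indices $X = (X^{AA'})$ in some seminal work by
R. Penrose in the context of twistors. When convenient, we will invoke the
Penrose notation. The map is chosen so that,

\[ \|x^\mu\|^2 = \det X = \det(X^{AA'}). \]

We can write the matrix in terms of a basis,

\begin{align}
X^{A\bar{B}} &= \begin{bmatrix} t+z & x-iy \\ x+iy & t-z \end{bmatrix}, \notag \\
&= t\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + x\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} + y\begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix} + z\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}, \notag \\
&= x^\mu\sigma_\mu, \tag{8.9}
\end{align}

where

\[
\sigma_0 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad
\sigma_1 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad
\sigma_2 = \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix}, \quad
\sigma_3 = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}. \tag{8.10}
\]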


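Here, $\sigma_0 = I$ and $\{\sigma_i,\ i = 1,2,3\}$ are the Pauli matrices. For
now, we constrain to the spatial part of the matrix by restricting to indices
$i, j, \ldots = 1,2,3$. Since $\det(X)$ is equal to the metric we wish to
preserve, we seek unitary matrices

\[ Q = \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} \in SU(2), \]

such that $\det(X)$ is invariant under a similarity transformation.
Specializing to $\mathbb{R}^3$ by setting the coordinate $t = 0$, the
similarity transformation reads

\begin{align}
\bar{X} &= Q X Q^\dagger, \notag \\
\begin{bmatrix} \bar{z} & \bar{x} - i\bar{y} \\ \bar{x} + i\bar{y} & -\bar{z} \end{bmatrix}
&= \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}
\begin{bmatrix} z & x-iy \\ x+iy & -z \end{bmatrix}
\begin{bmatrix} \delta & -\beta \\ -\gamma & \alpha \end{bmatrix}. \tag{8.11}
\end{align}

The quantities $\{\alpha, \beta, \gamma, \delta\}$ are called the Cayley-Klein
parameters. The structure of the Lie algebra $\mathfrak{su}(2)$ can be obtained
in a completely analogous manner as was done for the orthogonal groups. If
$t \mapsto e^{tA}$ is a one-parameter subgroup of $SU(2)$, then the inverse of
$e^{tA}$ is equal to its Hermitian adjoint, that is,

\[ (e^{tA})^\dagger = e^{-tA}. \]

Taking the derivative at $t = 0$, we find that $A^\dagger = -A$. The formula
$\det(e^{A}) = e^{\mathrm{Tr}\,A}$ shows that if $\det(e^{A}) = 1$ then
$\mathrm{Tr}\,A = 0$. We conclude that the Lie algebra,

\[ \mathfrak{su}(2) = \{A \in \mathfrak{gl}(2,\mathbb{C}) : A^\dagger = -A,\ \mathrm{Tr}(A) = 0\}, \tag{8.12} \]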




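consists of all traceless anti-Hermitian 2 × 2 matrices. This means that the
Cayley-Klein parameters are not all independent, as they must satisfy the
conditions

\[ \gamma = -\bar{\beta}, \quad \delta = \bar{\alpha}. \]

The rotation matrix in $\mathbb{R}^3$ represented by the Cayley-Klein
parameters can be obtained by direct computation of the matrix multiplication
8.11 and picking out the coefficients of the transformed vectors. One may use
the trick of setting

\[ x_+ = x + iy, \quad x_- = x - iy, \]

as done in Goldstein [11], or one can apply the transformation to the basis
vectors given by the Pauli matrices, or these days, one can simply insert the
expressions into a computer algebra system. The resulting matrix $A$ is given by

\[
A = \begin{bmatrix}
\frac{1}{2}(\alpha^2 - \gamma^2 + \delta^2 - \beta^2) & \frac{i}{2}(\gamma^2 - \alpha^2 + \delta^2 - \beta^2) & \gamma\delta - \alpha\beta \\
\frac{i}{2}(\alpha^2 + \gamma^2 - \delta^2 - \beta^2) & \frac{1}{2}(\alpha^2 + \gamma^2 + \delta^2 + \beta^2) & -i(\alpha\beta + \gamma\delta) \\
\beta\delta - \alpha\gamma & i(\alpha\gamma + \beta\delta) & \alpha\delta + \beta\gamma
\end{bmatrix}. \tag{8.13}
\]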


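By inspection, the set of Pauli matrices $\sigma = \{\sigma_1, \sigma_2, \sigma_3\}$
is a basis for the Lie algebra. A quick computation gives,

\[ \sigma_1^2 = \sigma_2^2 = \sigma_3^2 = I, \quad \det(\sigma_i) = -1, \]

and structure constants,

\[ [\sigma_i, \sigma_j] = 2i\,\epsilon_{ij}{}^{k}\,\sigma_k, \tag{8.14} \]

where $\epsilon_{ijk}$ is the Levi-Civita permutation symbol 2.41.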

The factor $2i$ in the formula for the structure constants creates a minor
conflict between physicists and mathematicians, but this is historically
unavoidable. For example, from the second permutation symbol identity in 2.46,
it follows immediately that, because of the $i$ factor, the components of the
Killing form are $-4\delta_{ij}$. Thus, for a physicist, a Lie algebra is
compact if the Killing form is negative definite, which is the opposite of
what was stated earlier. In quantum mechanics, it is customary to denote the
set of Pauli matrices by a vector-like notation
$\sigma = (\sigma_1, \sigma_2, \sigma_3)$, in which case the spin operator in
the spin 1/2 representation is written as $J = \frac{1}{2}\sigma$. The
multiplication table of Pauli matrices exhibits a cyclic permutation feature
as shown in the adjacent figure. We can immediately verify that
$\sigma_1\sigma_2 = i\sigma_3$, $\sigma_2\sigma_3 = i\sigma_1$, and
$\sigma_3\sigma_1 = i\sigma_2$. Thus, as shown in the diagram, the product of
two Pauli matrices gives $i$ times the third matrix if the product is taken
clockwise, and $-i$ times the third matrix if traversed counterclockwise. At
the center of the triangular diagram there is an $i$ as part of the mnemonic,
as a reminder not to forget this factor. Since the squares of the Pauli
matrices give the identity, to get an analog of Euler's formula in matrix form
as in equation 8.5, we use the set $\{i\sigma_1, i\sigma_2, i\sigma_3\}$. Like
$J$, these matrices all have squares equal to $-I$.
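The algebraic facts about the Pauli matrices quoted above are easy to confirm by brute force. The following sketch checks the squares, the commutators with the factor $2i$ and the Levi-Civita symbol, and the property that the matrices $i\sigma_k$ square to $-I$. The helper names are hypothetical, and indices run over 0, 1, 2 rather than 1, 2, 3.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b, scale=1):
    """Return a + scale * b entrywise."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def times(c, a):
    """Multiply every entry of a by the scalar c."""
    return [[c * x for x in row] for row in a]

def levi_civita(i, j, k):
    """Permutation symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

IDENTITY = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def commutator(a, b):
    return add(matmul(a, b), matmul(b, a), scale=-1)

def structure_rhs(i, j):
    """Right-hand side 2i * epsilon_ijk * sigma_k, summed over k."""
    total = ZERO
    for k in range(3):
        total = add(total, times(2j * levi_civita(i, j, k), SIGMA[k]))
    return total

squares_ok = all(matmul(s, s) == IDENTITY for s in SIGMA)
commutators_ok = all(commutator(SIGMA[i], SIGMA[j]) == structure_rhs(i, j)
                     for i in range(3) for j in range(3))
generators_ok = all(matmul(times(1j, s), times(1j, s)) == times(-1, IDENTITY)
                    for s in SIGMA)
cyclic_ok = matmul(SIGMA[0], SIGMA[1]) == times(1j, SIGMA[2])
```

Exact comparison is safe here because every entry is a small integer multiple of 1 or $i$.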




At this point, we inject the observation that the algebra of Pauli matrices
is very closely related to the set $\mathbb{H}$ of quaternions. A quaternion
is an entity of the form

\[ q = q_0 1 + q_1 i + q_2 j + q_3 k, \tag{8.15} \]

where the basis elements satisfy,

\[ i^2 = j^2 = k^2 = ijk = -1. \]

As a vector space, the space $\mathbb{H}$ of quaternions is isomorphic to
$\mathbb{R}^4$. The components $(q_1, q_2, q_3)$ are in 1-1 correspondence with
$\mathbb{R}^3$ vectors. It quickly follows that $ij = k$, $jk = i$, and
$ki = j$. A 2 × 2 matrix representation of the quaternion basis is obtained by
the identity matrix, together with setting
$\{i, j, k\} = \{-i\sigma_1, -i\sigma_2, -i\sigma_3\}$. Another way of saying
this is to set

\[ i = \sigma_3\sigma_2, \quad j = \sigma_1\sigma_3, \quad k = \sigma_2\sigma_1. \]


If one interprets the Pauli matrices as representations of linear
transformations in the complex plane, and since multiplication by $i$
represents a rotation by 90°, we see that in some sense the vector basis
$\{i, j, k\}$ of quaternion space corresponds to something more like dual
planes to the basis vectors given by the Pauli matrices. This is one of those
places where the factor of $i$ in the structure constants for Pauli matrices
causes differences with mathematical standards, the latter following more
closely the elegant algebraic construction by Hamilton. The triplet
$(q_1, q_2, q_3)$ of the quaternion is called the vector component. If one
defines the quaternion conjugate by

\[ \bar{q} = q_0 1 - q_1 i - q_2 j - q_3 k, \]

it follows that

\[ \|q\|^2 = q\bar{q} = q_0^2 + q_1^2 + q_2^2 + q_3^2. \]

If we set

\[ z_0 = q_0 + q_1 i, \quad \text{and} \quad z_1 = q_2 + q_3 i, \]

we can identify $\mathbb{H}$ with $\mathbb{C} + \mathbb{C}j$ by writing,

\[ q = z_0 + z_1 j. \]

In this notation, the conjugate of the quaternion is given by

\[ \bar{q} = \bar{z}_0 - z_1 j = \bar{z}_0 - j\bar{z}_1. \]
The complex conjugate on the last term above comes from the minus sign
introduced by the anti-commutation of $i$ and $j$, the price for transposing
$j$ to the front. If $q' = w_0 + w_1 j$ is another quaternion, the right
action of $q$ on $q'$ by quaternion multiplication gives,

\begin{align*}
q'q &= (w_0 + w_1 j)(z_0 + z_1 j), \\
&= w_0z_0 + w_0z_1 j + w_1 j z_0 + w_1 j z_1 j, \\
&= (w_0z_0 - w_1\bar{z}_1) + (w_0z_1 + w_1\bar{z}_0) j.
\end{align*}
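The product formula in terms of complex pairs can be checked against the usual Hamilton product on four real components. This is a minimal sketch; the tuple layout `(q0, q1, q2, q3)` and the function names are choices made here for illustration.

```python
def qmul(p, q):
    """Hamilton product of quaternions given as (q0, q1, q2, q3)."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
            a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
            a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
            a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0)

def to_pair(q):
    """Split q = z0 + z1 j with z0 = q0 + q1 i and z1 = q2 + q3 i."""
    return complex(q[0], q[1]), complex(q[2], q[3])

def pair_product(w, z):
    """(w0 + w1 j)(z0 + z1 j) written with complex pairs."""
    w0, w1 = w
    z0, z1 = z
    return (w0 * z0 - w1 * z1.conjugate(), w0 * z1 + w1 * z0.conjugate())

q_prime = (1, 2, 3, 4)      # plays the role of q' = w0 + w1 j
q = (5, 6, 7, 8)            # plays the role of q  = z0 + z1 j

direct = to_pair(qmul(q_prime, q))
via_pairs = pair_product(to_pair(q_prime), to_pair(q))
```

With these integer inputs both routes give the pair $(-60 + 12i,\ 30 + 24i)$, that is, the quaternion $-60 + 12i + 30j + 24k$.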




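In matrix form the right action of $q$ can be rewritten as

\[ \begin{bmatrix} w_0 & w_1 \end{bmatrix} \mapsto \begin{bmatrix} w_0 & w_1 \end{bmatrix} \begin{bmatrix} z_0 & z_1 \\ -\bar{z}_1 & \bar{z}_0 \end{bmatrix}. \]

Thus, if $q$ is a unit quaternion, that is, one with $\|q\| = 1$, the matrix
on the right above is in the generic form of an element of $SU(2)$. The set of
all unit quaternions can be identified with a three-sphere
$S^3 \subset \mathbb{R}^4$. The quaternions form a division algebra. If
$q \neq 0$, then, similar to complex numbers, the inverse is given by

\[ q^{-1} = \frac{\bar{q}}{\|q\|^2}. \]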


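Back to the Lie algebra, the quantities $\{i\sigma_1, i\sigma_2, i\sigma_3\}$
represent the infinitesimal transformations that generate the elements of the
group. Thus, for example, to generate a rotation by an angle $\phi$ around the
z-axis, we set,

\[ Q_\phi = e^{i\frac{\phi}{2}\sigma_3}. \tag{8.16} \]

Proceeding exactly as in the computation leading to equation 8.5, we find

\begin{align}
Q_\phi &= \cos\tfrac{\phi}{2}\, I + i\sin\tfrac{\phi}{2}\,\sigma_3, \tag{8.17} \\
&= \begin{bmatrix} e^{i\phi/2} & 0 \\ 0 & e^{-i\phi/2} \end{bmatrix}. \tag{8.18}
\end{align}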
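The result is a diagonal matrix since $\sigma_3$ is diagonal, and hence, so is
any power of $\sigma_3$. Yet another computation of the similarity
transformation $\bar{X} = Q_\phi X Q_\phi^\dagger$, where,

\[
X = \begin{bmatrix} z & x-iy \\ x+iy & -z \end{bmatrix}, \quad
\bar{X} = \begin{bmatrix} \bar{z} & \bar{x}-i\bar{y} \\ \bar{x}+i\bar{y} & -\bar{z} \end{bmatrix},
\]

yields,

\begin{align*}
\bar{x} &= x\cos\phi + y\sin\phi, \\
\bar{y} &= -x\sin\phi + y\cos\phi, \\
\bar{z} &= z.
\end{align*}

These are of course the correct equations for the rotation. It should be noted
that in the computation, which we leave as an exercise, we find a natural
appearance of double angle formulas for sine and cosine; this is how the
$\phi/2$ converts to a $\phi$ in the final equation. The next Euler angle
rotation is given by,

\begin{align}
Q_\theta &= e^{i\frac{\theta}{2}\sigma_1}, \tag{8.19} \\
&= \cos\tfrac{\theta}{2}\, I + i\sin\tfrac{\theta}{2}\,\sigma_1, \tag{8.20} \\
&= \begin{bmatrix} \cos\frac{\theta}{2} & i\sin\frac{\theta}{2} \\ i\sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{bmatrix}. \tag{8.21}
\end{align}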



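The third Euler angle rotation looks just like the one in equation 8.18 with
$\phi$ replaced by $\psi$,

\begin{align}
Q_\psi &= \cos\tfrac{\psi}{2}\, I + i\sin\tfrac{\psi}{2}\,\sigma_3, \tag{8.22} \\
&= \begin{bmatrix} e^{i\psi/2} & 0 \\ 0 & e^{-i\psi/2} \end{bmatrix}. \tag{8.23}
\end{align}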

The composite rotation is given by

\[
Q = Q_\psi Q_\theta Q_\phi = \begin{bmatrix}
e^{i(\psi+\phi)/2}\cos\frac{\theta}{2} & i e^{i(\psi-\phi)/2}\sin\frac{\theta}{2} \\
i e^{-i(\psi-\phi)/2}\sin\frac{\theta}{2} & e^{-i(\psi+\phi)/2}\cos\frac{\theta}{2}
\end{bmatrix}, \tag{8.24}
\]

which gives the Cayley-Klein parameters in terms of Euler angles. It should
be noted that if $Q$ represents a rotation, then $-Q$ represents exactly the
same rotation, since the minus sign cancels out in the similarity
transformation. In this sense, $SU(2)$ is called a double cover of
$SO(3,\mathbb{R})$.
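The correspondence between the matrix 8.24 and the Euler rotation 8.7, together with the double cover statement, can be checked numerically. The sketch below recovers the 3 × 3 rotation from $Q$ through $R_{ij} = \frac{1}{2}\mathrm{Tr}(\sigma_i Q \sigma_j Q^\dagger)$, a standard trace formula that follows from $\bar{X} = QXQ^\dagger$ with $X = x^j\sigma_j$; the formula and the function names are supplied here as assumptions and do not appear in the text.

```python
import cmath
import math

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Hermitian adjoint of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def cayley_klein(phi, theta, psi):
    """The matrix Q of equation 8.24."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    alpha = cmath.exp(1j * (psi + phi) / 2) * c
    beta = 1j * cmath.exp(1j * (psi - phi) / 2) * s
    return [[alpha, beta], [-beta.conjugate(), alpha.conjugate()]]

def rotation_from(q):
    """R_ij = (1/2) Tr(sigma_i Q sigma_j Q^dagger), so that xbar_i = R_ij x_j."""
    qd = dagger(q)
    rows = []
    for i in range(3):
        row = []
        for j in range(3):
            m = matmul(SIGMA[i], matmul(q, matmul(SIGMA[j], qd)))
            row.append((m[0][0] + m[1][1]).real / 2)
        rows.append(row)
    return rows

def euler_closed_form(phi, theta, psi):
    """Entries of equation 8.7."""
    cf, sf = math.cos(phi), math.sin(phi)
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(psi), math.sin(psi)
    return [[cp * cf - ct * sf * sp, cp * sf + ct * cf * sp, sp * st],
            [-sp * cf - ct * sf * cp, -sp * sf + ct * cf * cp, cp * st],
            [st * sf, -st * cf, ct]]

phi, theta, psi = 0.3, 1.1, -0.6
Q = cayley_klein(phi, theta, psi)
minus_Q = [[-x for x in row] for row in Q]

R_plus = rotation_from(Q)
R_minus = rotation_from(minus_Q)
R_euler = euler_closed_form(phi, theta, psi)

error_vs_euler = max(abs(R_plus[i][j] - R_euler[i][j])
                     for i in range(3) for j in range(3))
error_double_cover = max(abs(R_plus[i][j] - R_minus[i][j])
                         for i in range(3) for j in range(3))
```

Both errors should be at round-off level: $Q$ reproduces the Euler rotation, and $Q$ and $-Q$ give the same element of $SO(3,\mathbb{R})$.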
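Let $n = (n^1, n^2, n^3)$ be a unit vector, and as before, set
$\sigma = (\sigma_1, \sigma_2, \sigma_3)$. We call a unit Pauli-Bloch vector,
denoted by $n\cdot\sigma$, the expression given by the matrix,

\[ n\cdot\sigma = n^k\sigma_k = \begin{bmatrix} n^3 & n^1 - in^2 \\ n^1 + in^2 & -n^3 \end{bmatrix}. \]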


It is very easy to verify that given two vectors $a$ and $b$, we have

\[ (a\cdot\sigma)(b\cdot\sigma) = (a\cdot b)\, I + i\,(a\times b)\cdot\sigma. \]

Although we do not need the result above here, it is very neat that the
formula, which in essence describes the product of quaternions, incorporates
both the dot and the cross products. The formula is helpful in establishing
identities for products of quaternions. With the notation above, the equation,

\[ e^{i\frac{\theta}{2}(n\cdot\sigma)} = \cos\tfrac{\theta}{2}\, I + i\,(n\cdot\sigma)\sin\tfrac{\theta}{2} \tag{8.25} \]
gives a generalization of Euler's formula extended to quaternions via Pauli
matrices. This is a beautiful result which was the goal that led Hamilton to
introduce quaternions in 1843. The formula represents a rotation by an angle
$\theta$ about an axis in the direction of the unit vector $n$. In terms of
Hamilton quaternions, the rotation matrix in $\mathbb{R}^3$ is obtained by
conjugation with a quaternion $q$,

\[ \bar{X} = q X q^{-1}, \]

where,

\[
R(q) = R(\theta, n) = \begin{bmatrix}
q_0^2 + q_1^2 - q_2^2 - q_3^2 & 2q_1q_2 - 2q_0q_3 & 2q_1q_3 + 2q_0q_2 \\
2q_1q_2 + 2q_0q_3 & q_0^2 - q_1^2 + q_2^2 - q_3^2 & 2q_2q_3 - 2q_0q_1 \\
2q_1q_3 - 2q_0q_2 & 2q_2q_3 + 2q_0q_1 & q_0^2 - q_1^2 - q_2^2 + q_3^2
\end{bmatrix}, \tag{8.26}
\]

with,

\begin{align*}
q &= q_0 + q_1 i + q_2 j + q_3 k, \\
&= \cos\tfrac{\theta}{2} + [n_1 i + n_2 j + n_3 k]\sin\tfrac{\theta}{2}.
\end{align*}


This is clearly a generalization of the corresponding rotation matrix 8.1 in
$\mathbb{R}^2$ in terms of half-angle parameters. The formulation is not
cluttered by factors of $i$; this is the preferred form for computer
scientists, computer game developers, and an increasing number of engineers,
to code rotations in numerical computations.
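As an illustration of how the quaternion form is typically coded, the following sketch builds the matrix 8.26 from an axis and an angle. It assumes the standard orthogonal form of the matrix, in which the (3,2) entry is $2q_2q_3 + 2q_0q_1$; the function names are illustrative.

```python
import math

def rotation_matrix(q):
    """Rotation matrix of equation 8.26 for a unit quaternion (q0, q1, q2, q3)."""
    q0, q1, q2, q3 = q
    return [
        [q0*q0 + q1*q1 - q2*q2 - q3*q3, 2*q1*q2 - 2*q0*q3, 2*q1*q3 + 2*q0*q2],
        [2*q1*q2 + 2*q0*q3, q0*q0 - q1*q1 + q2*q2 - q3*q3, 2*q2*q3 - 2*q0*q1],
        [2*q1*q3 - 2*q0*q2, 2*q2*q3 + 2*q0*q1, q0*q0 - q1*q1 - q2*q2 + q3*q3],
    ]

def axis_angle(theta, n):
    """Unit quaternion cos(theta/2) + (n1 i + n2 j + n3 k) sin(theta/2)."""
    s = math.sin(theta / 2)
    return (math.cos(theta / 2), n[0] * s, n[1] * s, n[2] * s)

def apply(r, v):
    """Matrix-vector product for a 3x3 matrix and a 3-vector."""
    return [sum(r[i][j] * v[j] for j in range(3)) for i in range(3)]

# Quarter turn about the z-axis applied to the unit vector along x.
q = axis_angle(math.pi / 2, (0.0, 0.0, 1.0))
image = apply(rotation_matrix(q), [1.0, 0.0, 0.0])

# q and -q give the same matrix, since every entry is quadratic in q.
same_rotation = rotation_matrix(q) == rotation_matrix(tuple(-c for c in q))

# Orthogonality for a generic axis.
axis = (1 / 3, 2 / 3, 2 / 3)
R = rotation_matrix(axis_angle(0.9, axis))
orthogonality_error = max(
    abs(sum(R[i][k] * R[j][k] for k in range(3)) - (1.0 if i == j else 0.0))
    for i in range(3) for j in range(3))
```

With this form of the matrix, the quarter turn about z sends the x direction to the y direction, up to round-off.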
The Maurer-Cartan form $\omega$ of $SU(2)$ can be computed directly from the
Cayley-Klein parameters,

\[ \omega = Q^{-1}\, dQ. \]


In the computation we write the Lie algebra valued form as 


We then compute $Q^{-1}dQ$ and read the forms. The computation is actually
easier by hand than using Maple, but we recommend a pen with an extra fine
tip, and working on a sheet of paper in landscape orientation. The computation
is facilitated by noting that we only have to compute the first column to read
the components of the form. The result is

\begin{align}
\omega^1 &= \cos\phi\, d\theta + \sin\phi\sin\theta\, d\psi, \notag \\
\omega^2 &= \sin\phi\, d\theta - \cos\phi\sin\theta\, d\psi, \notag \\
\omega^3 &= d\phi + \cos\theta\, d\psi. \tag{8.27}
\end{align}


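One can then verify that

\[ d\omega^i = \tfrac{1}{2}\,\epsilon^i{}_{jk}\,\omega^j\wedge\omega^k, \]

from which we get the metric associated with the Killing form

\begin{align}
ds^2 &= \delta_{ij}\,\omega^i\omega^j, \notag \\
&= (\omega^1)^2 + (\omega^2)^2 + (\omega^3)^2, \notag \\
&= d\theta^2 + \sin^2\theta\, d\psi^2 + (d\phi + \cos\theta\, d\psi)^2. \tag{8.28}
\end{align}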
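As a worked instance of the structure equation, one can check the component $i = 3$ by hand using only the forms in 8.27. The short derivation below is a sketch of that single case, with the convention that the right-hand side $\frac{1}{2}\epsilon^3{}_{jk}\,\omega^j\wedge\omega^k$ reduces to $\omega^1\wedge\omega^2$.

```latex
\begin{align*}
d\omega^3 &= d(d\phi + \cos\theta\, d\psi) = -\sin\theta\, d\theta\wedge d\psi, \\
\omega^1\wedge\omega^2
  &= (\cos\phi\, d\theta + \sin\phi\sin\theta\, d\psi)\wedge
     (\sin\phi\, d\theta - \cos\phi\sin\theta\, d\psi), \\
  &= -\cos^2\phi\,\sin\theta\; d\theta\wedge d\psi
     + \sin^2\phi\,\sin\theta\; d\psi\wedge d\theta, \\
  &= -\sin\theta\, d\theta\wedge d\psi .
\end{align*}
```

Hence $d\omega^3 = \omega^1\wedge\omega^2$. The other two components follow in the same manner, with slightly longer expressions.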


For a discussion of the dynamics of rigid bodies using Euler angles, see the book 
Classical Mechanics by Goldstein [11]. 


8.1.4 Hopf Fibration 


In this section we discuss fibration structures over the projective spaces
$\mathbb{F}P^1$, where $\mathbb{F}$ stands for one of the division algebras
$\{\mathbb{R}, \mathbb{C}, \mathbb{H}, \mathbb{O}\}$, that is, the real, the
complex, the quaternion, and the octonion algebras. The classical Hopf
fibration is the one associated with $\mathbb{C}P^1$, but for pedagogical
reasons, it might be more instructive to start with the simpler case of the
real projective line.


Hopf map on RP¹


Let $(q_0, q_1)$ be coordinates in $\mathbb{R}^2$ and consider the equivalence
relation $(q_0, q_1) \sim (\lambda q_0, \lambda q_1)$. The projective line
$\mathbb{R}P^1$ is defined by the quotient of the plane with this equivalence
relation,

\[ \mathbb{R}P^1 = (\mathbb{R}^2 - \{0\})/\sim. \]

Geometrically, $\mathbb{R}P^1$ consists of the space of lines through the
origin in $\mathbb{R}^2$. The coordinates $(q_0, q_1)$ are called homogeneous
coordinates of $\mathbb{R}P^1$. Keeping in mind that what really determines a
point on the projective space are the ratios of the coordinates, we can cover
the manifold with two patches corresponding to $\{q_0/q_1,\ q_1 \neq 0\}$ and
$\{q_1/q_0,\ q_0 \neq 0\}$. We subject the coordinates to the restriction

\[ q_0^2 + q_1^2 = 1. \]

The equation represents a unit circle $S^1$ centered at the origin in
$\mathbb{R}^2$. The circle can be parametrized by

\[ q_0 = \cos\theta, \quad q_1 = \sin\theta. \]

The restriction implies that $|\lambda| = 1$. Topologically, the set of such
$\lambda$'s is the 0-sphere $S^0 = \{1, -1\}$, and has the structure of the
group $\mathbb{Z}_2$. Every line through the origin intersects the unit circle
in exactly two antipodal points which determine the same line, so we have a
fibration

\[ \mathbb{Z}_2 \to S^1 \xrightarrow{\pi} S^1. \]
The Hopf map $\pi$ for this fibration is defined by

\[ \pi(q_0, q_1) = (2q_0q_1,\ |q_0|^2 - |q_1|^2). \]

Of course, the absolute values in the equation above are redundant, but they
are included here for motivation for the other Hopf fibrations. If we associate
$(q_0, q_1)$ with a rotation matrix in $SO(2)$, the reader will recognize this
map as the representation of rotations by half-angles 8.1,

\[ \begin{bmatrix} q_0 & q_1 \\ -q_1 & q_0 \end{bmatrix}^2 = \begin{bmatrix} q_0^2 - q_1^2 & 2q_0q_1 \\ -2q_0q_1 & q_0^2 - q_1^2 \end{bmatrix}. \]


If we were to try to define a rotation by quaternions in two dimensions, this
would be it. Let $\zeta$ be the coordinate in $\mathbb{R}$ representing the
stereographic projection of a point
$(x,y) \in S^1 \xrightarrow{\pi_s} \mathbb{R}$. We recall from equation 5.66 that

\[ (x, y) = \left(\frac{2\zeta}{\zeta^2 + 1},\ \frac{\zeta^2 - 1}{\zeta^2 + 1}\right). \]

If we now let $\zeta = q_0/q_1$ and simplify the expression above, we get

\[ x = 2q_0q_1, \quad y = q_0^2 - q_1^2, \]




which are precisely the coordinates of the image of the Hopf map. In other
words, we have a remarkable relation between the Hopf map and the
stereographic projection of the base space given by,

\[ \pi(q_0, q_1) = \pi_s^{-1}(q_0/q_1), \quad q_1 \neq 0. \]

If $\zeta_1$ is the coordinate patch associated with the stereographic
projection from the north pole and $\zeta_2$ the patch associated with the
south pole, then in the overlap region the transition functions that glue the
fibration are given by

\[ \zeta_2 = \frac{1}{\zeta_1}. \]
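The real Hopf map is simple enough to test in a few lines. The sketch below checks that $\pi(\cos\theta, \sin\theta) = (\sin 2\theta, \cos 2\theta)$, that antipodal points have the same image, and that the image agrees with the inverse stereographic projection of $\zeta = q_0/q_1$ as written above. Function names are hypothetical.

```python
import math

def hopf_real(q0, q1):
    """Real Hopf map (q0, q1) -> (2 q0 q1, q0^2 - q1^2)."""
    return (2 * q0 * q1, q0 * q0 - q1 * q1)

def inverse_stereographic(zeta):
    """Point (x, y) on the unit circle with stereographic coordinate zeta."""
    d = zeta * zeta + 1
    return (2 * zeta / d, (zeta * zeta - 1) / d)

theta = 0.4
q0, q1 = math.cos(theta), math.sin(theta)

image = hopf_real(q0, q1)
antipodal_image = hopf_real(-q0, -q1)
via_projection = inverse_stereographic(q0 / q1)
```

The image winds twice around the base circle as $\theta$ goes once around the total space, which is the half-angle phenomenon noted in the text.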


Hopf map on CP¹


The matrices that represent elements of $SU(2)$ can be written in terms of
the Cayley-Klein parameters

\[ Q = \begin{bmatrix} \alpha & \beta \\ -\bar{\beta} & \bar{\alpha} \end{bmatrix}, \]

where,

\begin{align}
\alpha &= e^{i(\psi+\phi)/2}\cos\tfrac{\theta}{2}, \tag{8.29} \\
\beta &= e^{i(\psi-\phi)/2}\sin\tfrac{\theta}{2}. \tag{8.30}
\end{align}

With apologies to the purists, we leave out a factor of $i$ in the $\beta$
parameter in this section. This is done for the sake of better consistency
with other formalism that we need for this discussion. We present the Hopf map
in terms of Euler angle rotations, but we could just as easily use Hamilton
quaternion variables. Since $\det Q = 1$, we have,

\[ |\alpha|^2 + |\beta|^2 = 1. \tag{8.31} \]

This can be corroborated immediately since,

\[ |\alpha|^2 + |\beta|^2 = \cos^2\tfrac{\theta}{2} + \sin^2\tfrac{\theta}{2} = 1. \]

We write $(\alpha, \beta) \in \mathbb{C}\times\mathbb{C}$ in the form,

\[ \alpha = x^1 + ix^2, \quad \beta = x^3 + ix^4. \]

As a vector space, $\mathbb{C}^2 \cong \mathbb{R}^4$, so equation 8.31 gives
the equation of a unit sphere $S^3 \subset \mathbb{R}^4$,

\[ (x^1)^2 + (x^2)^2 + (x^3)^2 + (x^4)^2 = 1. \]

In other words, the set of unit quaternions $U(1,\mathbb{H})$ is a sphere
$S^3$, in analogy to $U(1,\mathbb{C})$ which describes a circle $S^1$. The
classical Hopf fibration (or Hopf bundle) is the map $\pi : S^3 \to S^2$ given by,

\[ \pi(\alpha, \beta) = (2\alpha\bar{\beta},\ |\alpha|^2 - |\beta|^2) \in \mathbb{C}\times\mathbb{R} \cong \mathbb{R}^3, \tag{8.32} \]


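or, in matrix form

\[ \begin{bmatrix} z & x-iy \\ x+iy & -z \end{bmatrix} = \begin{bmatrix} |\alpha|^2 - |\beta|^2 & 2\bar{\alpha}\beta \\ 2\alpha\bar{\beta} & |\beta|^2 - |\alpha|^2 \end{bmatrix}. \]

Indeed, for all $\alpha$ and $\beta$ on $S^3$, the image of this map is in
$S^2$ because

\begin{align*}
|\pi(\alpha, \beta)|^2 &= 4|\alpha|^2|\beta|^2 + (|\alpha|^2 - |\beta|^2)^2, \\
&= (|\alpha|^2 + |\beta|^2)^2, \\
&= 1.
\end{align*}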
Any other point $(\alpha', \beta')$ that maps to the same point
$\pi(\alpha, \beta)$ must satisfy
$(\alpha', \beta') = (\lambda\alpha, \lambda\beta)$ for some complex number
$\lambda$ with $|\lambda|^2 = 1$. When this happens, we say that these points
are in the same equivalence class. Then, the projective space defined as,

\[ \mathbb{C}P^1 = (\mathbb{C}^2 - \{0\})/\sim \]

represents the space of complex lines through the origin of $\mathbb{C}^2$.
The complex projective line $\mathbb{C}P^1$ has the structure of a compact
complex manifold of complex dimension 1. Geometrically, it can be identified
with the sphere $S^2$ (the Riemann sphere). In quantum physics and quantum
computing, this is called the Bloch sphere. Points in $\mathbb{C}P^1$ can be
described by homogeneous coordinates $(\alpha, \beta)$ as representatives of
the equivalence classes, or by inhomogeneous coordinates

\[ \zeta_1 = \frac{\alpha}{\beta},\ \beta \neq 0, \quad \text{or} \quad \zeta_2 = \frac{\beta}{\alpha},\ \alpha \neq 0. \]


Figure 8.2 depicts a somewhat misleading but still useful visualization of
the construction of $\mathbb{C}P^1$. The horizontal and vertical axes are
copies of $\mathbb{C}$ parametrized by $\alpha$ and $\beta$. Since a complex
line is really a plane, the product space is 4-dimensional. The unit sphere
centered at the origin is given by the equation $|\alpha|^2 + |\beta|^2 = 1$,
so this is a three-sphere $S^3$. The intersection of a complex line with $S^3$
is a circle $S^1$. The only point common to two different lines through the
origin is the origin, so the corresponding circles of intersection are
disjoint. The collection of all these circles is parametrized by a two-sphere
$S^2$. In other words, if $p \in \mathbb{C}P^1 \cong S^2$ then $\pi^{-1}(p)$
is a circle $S^1 \subset S^3$. The circles $S^1$ are called the fibers of the
fibration,

\[ \pi : U(2)/U(1) \cong S^3 \to S^2. \tag{8.33} \]

Fig. 8.2: CP¹. Intersections of complex lines with S³ are S¹'s.


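What is not at all obvious is that any of these circles is linked exactly once
with any other circle of the fibration. To "unpack" this fibration in familiar
terms, we first compute the map in Cayley-Klein coordinates,

\begin{align*}
2\alpha\bar{\beta} &= 2\,[e^{i(\psi+\phi)/2}\cos\tfrac{\theta}{2}]\,[e^{-i(\psi-\phi)/2}\sin\tfrac{\theta}{2}], \\
&= 2e^{i\phi}\cos\tfrac{\theta}{2}\sin\tfrac{\theta}{2}, \\
&= e^{i\phi}\sin\theta, \\
|\alpha|^2 - |\beta|^2 &= \cos^2\tfrac{\theta}{2} - \sin^2\tfrac{\theta}{2}, \\
&= \cos\theta.
\end{align*}

Let $w = 2\alpha\bar{\beta}$. Identifying $\mathbb{C}\times\mathbb{R}$ with
$\mathbb{R}^3$, that is, taking $x = \Re(w)$, $y = \Im(w)$, we get a point in
$S^2$ in spherical coordinates

\begin{align*}
x &= \cos\phi\sin\theta, \\
y &= \sin\phi\sin\theta, \\
z &= \cos\theta.
\end{align*}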

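Let $\zeta$ be the complex number in equation 5.61, whose inverse image under
the stereographic projection $\pi_s$ gives the coordinates on the sphere,

\[ (x, y, z) = \left(\frac{\zeta + \bar{\zeta}}{\zeta\bar{\zeta} + 1},\ \frac{\zeta - \bar{\zeta}}{i(\zeta\bar{\zeta} + 1)},\ \frac{\zeta\bar{\zeta} - 1}{\zeta\bar{\zeta} + 1}\right). \]

Setting $\zeta = \alpha/\beta$ to be the inhomogeneous coordinates of
$\mathbb{C}P^1$ and simplifying the double fraction using the fact that
$|\alpha|^2 + |\beta|^2 = 1$, we get the Hopf map 8.32. That is, a point
$\pi(\alpha, \beta)$ in $S^2$ given by the image of the Hopf map is just the
point in $S^2$ obtained by the inverse stereographic projection

\[ \pi(\alpha, \beta) = \pi_s^{-1}(\alpha/\beta), \quad \beta \neq 0. \]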


As stated above, the fiber of a point in $S^2$ is a circle $S^1$ in $S^3$. The
fibers of points on a circle in $S^2$ parallel to the equator are linked
circles that lie on a torus; these are called Villarceau circles.
Geometrically, the Villarceau circles are obtained by the intersection of a
torus and a plane tangential to antipodal images of the generating circle.
Hopf discovered the fibration in 1931, but I only learned about Hopf
fibrations in 1975 from studying Taub's solution to Einstein's equation.
Taub's metric has topology $\mathbb{R}\times S^3$ and spatial $SU(2)$
symmetry. The Taub metric is of the form

\[ ds^2 = -d\tau^2 + \gamma_{ij}(\tau)\,\omega^i\otimes\omega^j, \tag{8.34} \]

where $\omega$ is essentially the Maurer-Cartan form of $\mathfrak{su}(2)$.
This is an example of a gravitational instanton. However, the first time I saw
a pictorial representation of the fibration was a magnificent hand drawing
made by Roger Penrose discussing Robinson Congruences in the context of
twistor theory; a reproduction of this drawing appears in [29], for example.
One has to marvel at the earlier masters who were able to visualize this
complex structure. For us lesser humans, nowadays it takes little effort to
render the images with a computer algebra system, by lifting a parallel circle
in $S^2$ to $S^3$, followed by a stereographic projection from $S^3$ to
$\mathbb{R}^3$. The nested toroidal $S^1$ links in figure 8.3 are the fibers
of three circular parallels in the base space $S^2$. The reader will find a
beautiful explanation of Villarceau circles in a paper by Hirsch [15].

Fig. 8.3: Hopf Fibration


There is a generalization of Hopf fibrations with $S^1$ fibers to all complex
projective spaces $\mathbb{C}P^n$. Consider the space $\mathbb{C}^{n+1}$ with
coordinates $Z = (z_1, z_2, \ldots, z_{n+1})$. The equation
$|z_1|^2 + |z_2|^2 + \cdots + |z_{n+1}|^2 = 1$ describes a sphere
$S^{2n+1} \subset \mathbb{C}^{n+1}$. The space $\mathbb{C}P^n$ is the space of
complex lines through the origin. Let $a, b \in S^{2n+1}$. Define an
equivalence relation $a \sim b$ if there exists $c \in S^1$ such that
$a = bc$. The idea is that a complex line through the origin is a plane of
real dimension 2, which intersects the sphere in a circle. All points in such
a circle are in the same equivalence class. $\mathbb{C}P^n$ can be identified
with the space $S^{2n+1}/\sim$. The generalized fibration is,

\[ S^1 \to S^{2n+1} \xrightarrow{\pi} \mathbb{C}P^n. \tag{8.35} \]

The corresponding fibration in the real case is

\[ S^n \xrightarrow{\pi} \mathbb{R}P^n, \tag{8.36} \]

since a line through the origin intersects the sphere $S^n$ in two antipodal
points. The group elements are the identity and the antipodal map.


Hopf map on HP¹


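The construction of the Hopf fibration over quaternion space follows along
the same lines. Let $(q_1, q_2)$ be quaternion coordinates in
$\mathbb{H}^2 \cong \mathbb{R}^8$, with

\begin{align*}
q_1 &= x^1 + x^2 i + x^3 j + x^4 k, \\
q_2 &= x^5 + x^6 i + x^7 j + x^8 k.
\end{align*}

Introduce complex coordinates,

\begin{align*}
z_1 &= x^1 + x^2 i, & z_3 &= x^5 + x^6 i, \\
z_2 &= x^3 + x^4 i, & z_4 &= x^7 + x^8 i,
\end{align*}

so that

\begin{align}
q_1 &= z_1 + z_2 j, \notag \\
q_2 &= z_3 + z_4 j. \tag{8.37}
\end{align}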


Consider the equivalence relation $(q_1, q_2) \sim (\lambda q_1, \lambda q_2)$,
$\lambda \in \mathbb{H}$, and define the quaternionic projective space

\[ \mathbb{H}P^1 = (\mathbb{H}^2 - 0)/\sim. \]

As before, $(q_1, q_2)$ are homogeneous coordinates representing equivalence
classes of quaternionic lines. The space can be covered by two inhomogeneous
coordinate charts

\[ \zeta_1 = \frac{q_1}{q_2},\ q_2 \neq 0, \quad \text{or} \quad \zeta_2 = \frac{q_2}{q_1},\ q_1 \neq 0. \]

We impose the condition,

\[ |q_1|^2 + |q_2|^2 = \sum_{k=1}^{8} |x^k|^2 = 1, \]

which represents a sphere $S^7 \subset \mathbb{H}^2$. This implies that on the
sphere $S^7$, the $\lambda$'s are unit quaternions, that is,
$|\lambda|^2 = 1$. Thus, the fibers are 3-spheres, and we have a fibration

\[ S^3 \to S^7 \xrightarrow{\pi} S^4, \]

S? > Sp(2)/Sp(1) 4 Sp(1). 
The Hopf map $\pi : S^7 \to S^4$ is defined by

\[ \pi(q_1, q_2) = (2q_1\bar{q}_2,\ |q_1|^2 - |q_2|^2) \in \mathbb{H}\times\mathbb{R} \cong \mathbb{R}^5. \]

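These look like the usual suspects. Let $\xi, \eta \in \mathbb{C}$ so that

\[ \xi + \eta j \in \mathbb{H}, \]

and set

\begin{align*}
\xi + \eta j &= 2q_1\bar{q}_2, \\
z &= |q_1|^2 - |q_2|^2.
\end{align*}

We can then arrange the Hopf map in familiar (quaternionic) matrix form

\[ \begin{bmatrix} z & \overline{\xi + \eta j} \\ \xi + \eta j & -z \end{bmatrix} = \begin{bmatrix} |q_1|^2 - |q_2|^2 & 2q_2\bar{q}_1 \\ 2q_1\bar{q}_2 & |q_2|^2 - |q_1|^2 \end{bmatrix}. \]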


We can easily corroborate that $(\xi, \eta, z)$ represent points in $S^4$,
again by the familiar process

\begin{align*}
|\xi|^2 + |\eta|^2 + |z|^2 &= |2q_1\bar{q}_2|^2 + (|q_1|^2 - |q_2|^2)^2, \\
&= |q_1|^4 + 2|q_1|^2|q_2|^2 + |q_2|^4, \\
&= (|q_1|^2 + |q_2|^2)^2 = 1.
\end{align*}

One can be a bit more explicit, inserting the complex coordinates 8.37 and
carrying out the short computation. We get

\begin{align*}
\xi &= 2(z_1\bar{z}_3 + z_2\bar{z}_4), \\
\eta &= 2(z_2z_3 - z_1z_4), \\
z &= |z_1|^2 + |z_2|^2 - |z_3|^2 - |z_4|^2.
\end{align*}
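The explicit expressions for $\xi$, $\eta$ and $z$ can be cross-checked against a direct quaternion computation of $2q_1\bar{q}_2$. In the sketch below, a sample point of $\mathbb{R}^8$ is normalized onto $S^7$, the formulas in complex coordinates are evaluated, and the result is compared with the Hamilton product. The sample values and function names are arbitrary.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions given as (q0, q1, q2, q3)."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
            a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
            a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
            a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0)

def qconj(q):
    """Quaternion conjugate."""
    return (q[0], -q[1], -q[2], -q[3])

# A sample point of R^8, normalized so that it lies on S^7.
xs = [0.3, -1.2, 0.5, 0.8, -0.4, 1.1, 0.9, -0.7]
norm = math.sqrt(sum(x * x for x in xs))
xs = [x / norm for x in xs]

q1, q2 = tuple(xs[:4]), tuple(xs[4:])
z1, z2 = complex(xs[0], xs[1]), complex(xs[2], xs[3])
z3, z4 = complex(xs[4], xs[5]), complex(xs[6], xs[7])

# Hopf map in complex coordinates.
xi = 2 * (z1 * z3.conjugate() + z2 * z4.conjugate())
eta = 2 * (z2 * z3 - z1 * z4)
z = abs(z1) ** 2 + abs(z2) ** 2 - abs(z3) ** 2 - abs(z4) ** 2

# Direct computation of 2 q1 conj(q2), split as xi + eta j.
a = [2 * c for c in qmul(q1, qconj(q2))]
direct_xi, direct_eta = complex(a[0], a[1]), complex(a[2], a[3])

norm_squared = abs(xi) ** 2 + abs(eta) ** 2 + z ** 2
```

The two computations of $\xi$ and $\eta$ should agree, and `norm_squared` should equal 1 up to round-off, placing the image on $S^4$.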
At this point it should not be surprising that if one denotes by $\zeta_1$ the
quaternion representing a point on $S^4$ under the stereographic projection
$\pi_s$ from the north pole to $\mathbb{H} \cong \mathbb{R}^4$, then the
quaternionic Hopf map is related to this projection by the relation

\[ \pi(q_1, q_2) = \pi_s^{-1}\left(\frac{q_1}{q_2}\right), \quad q_2 \neq 0. \]

If $\zeta_1$ and $\zeta_2$ represent charts overlapping over a narrow band
around the "equator" under the projective maps from the north and south pole
respectively, then on the overlap the transition functions are

\[ \zeta_2 = \frac{1}{\zeta_1}, \]

and its inverse in the other direction.

For now, we will stay away from the octonion algebra because it is not
associative; however, there is also an octonionic projective line version of
the Hopf map. The results are summarized in the following list,

\begin{align*}
S^0 &\to S^1 \to S^1 \cong \mathbb{R}P^1, \\
S^1 &\to S^3 \to S^2 \cong \mathbb{C}P^1, \\
S^3 &\to S^7 \to S^4 \cong \mathbb{H}P^1, \\
S^7 &\to S^{15} \to S^8 \cong \mathbb{O}P^1.
\end{align*}




The Hopf fibration is a seminal discovery in algebraic topology because,
through a formalism called the long exact homotopy sequence of a fibration, it
became possible to establish the existence of the first nonvanishing higher
homotopy groups of spheres. The long exact homotopy sequence applied to
$S^3 \xrightarrow{\pi} S^2$ yields the result

\[ \pi_3(S^2) \cong \pi_3(S^3) = \mathbb{Z}. \]

The Hopf fibration associated with $\mathbb{C}P^1 = P^1(\mathbb{C})$ describes
a singly-charged Dirac monopole, and the fibration associated with
$\mathbb{H}P^1$ enters in the description of a Yang-Mills instanton. These are
explored in chapter 9 after properly introducing connections on principal
fiber bundles.


8.1.5 Angular Momentum 


As indicated in the preface to this book, we present a simplified and lim- 
ited version of basic quantum mechanics for the benefit of those mathematics 
students who have no formal training on the subject. Quantum Mechanics was 
developed in 1926 by Schrödinger and Heisenberg. The axiomatic description 
here is a summary of the framework as envisioned by Dirac and von-Neumann. 
The axioms really are axioms in the sense of Euclid; they cannot be proved. The 
axioms are not self-evident, but they are founded on experience and physical 
intuition. 


• Postulate QM1 The state of a particle is described by a wave function
$\psi(t,x,y,z)$ in some complex Hilbert space $\mathcal{H}$ with inner
product $\langle\ |\ \rangle$. The quantity,

\[ \psi^*\psi\, d^3r \]

represents the probability of finding the particle within a volume element
$d^3r$. The total probability is,

\[ P = \int \psi^*\psi\, d^3r = 1. \]

• Postulate QM2 Measurable (or observable) quantities such as energy and
momentum are represented by a linear Hermitian operator $L$ acting on $\psi$.
The measurement of the state is given by the expectation value,

\begin{align*}
\langle L\rangle &= \langle\psi | L\psi\rangle, \\
&= \int \psi^* L\psi\, d^3r.
\end{align*}

• Postulate QM3 From the spectral theorem for Hermitian operators, the
possible outcomes of the observables are the eigenvalues of the operator. The
eigenvalues are real, and eigenstates corresponding to different eigenvalues
are orthogonal.




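The position operator is multiplication. The linear momentum and energy
operators are obtained intuitively by starting with a classical solution to the
wave equation. In dimension one, the quantity,

\[ \psi = Ae^{i(kx - \omega t)} \]

is such a solution with speed $v = \omega/k$. The energy $E$ and momentum $p$
are related to the wave number $k$ and the angular frequency $\omega$ by the
equations,

\[ p = \hbar k, \quad E = \hbar\omega, \]

so that,

\[ \psi = Ae^{\frac{i}{\hbar}(px - Et)}. \]

Taking partial derivatives with respect to $x$ and $t$ respectively, we get,

\begin{align*}
\frac{\partial\psi}{\partial x} &= \frac{i}{\hbar}\, p\, \psi, \\
\frac{\partial\psi}{\partial t} &= -\frac{i}{\hbar}\, E\, \psi.
\end{align*}

By ansatz, the choice for the operators is,

\[ p_x = \frac{\hbar}{i}\frac{\partial}{\partial x}. \]

Generalizing momentum to dimension 3, the operators become,

\begin{align*}
p &= \frac{\hbar}{i}\nabla, \\
E &= i\hbar\frac{\partial}{\partial t}.
\end{align*}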


The operator for the total energy, called the Hamiltonian $H$, is the kinetic
energy $KE = p^2/(2m)$ plus the potential energy $PE = V$. Thus, Schrödinger
was led to the quantum mechanics equation,

$$H\psi = E\psi, \qquad
-\frac{\hbar^2}{2m}\nabla^2\psi + V\psi = i\hbar\frac{\partial \psi}{\partial t}. \tag{8.38}$$


For a free particle for which the energy does not depend on time, the stationary
states are described by the discrete set of eigenfunctions and energy eigenvalues
of the equation,

$$H\psi_n = E_n\psi_n,$$

where

$$\psi = \psi_n\, e^{-\frac{i}{\hbar}E_n t},$$

and $\psi_n$ depends only on the spatial coordinates $\mathbf{r} = (x_1, x_2, x_3) = (x, y, z)$. Let
$\mathbf{p} = (p_x, p_y, p_z)$ be the components of the momentum operator. The following
basic commutation relations hold,

$$[x_i, x_j] = 0, \qquad [p_i, p_j] = 0, \qquad [p_i, x_j] = -i\hbar\,\delta_{ij}.$$


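The first commutation relation above is trivial since in number multiplication,
the order of the factors does not alter the product. The commutation relation
for two momenta follows from the symmetry of indices of second order partial
derivatives. The third commutation relation can be verified for each pair of
indices. In most elementary quantum mechanics books, one example is worked out
and the rest are taken on faith or left as an exercise. Here is one example. Let
$f$ be an arbitrary function; compute,

$$\begin{aligned}
[p_x, x]f &= -i\hbar\left(\frac{\partial}{\partial x}(xf) - x\frac{\partial f}{\partial x}\right),\\
&= -i\hbar\,(f + x f_x - x f_x),\\
&= -i\hbar f,\\
[p_x, x] &= -i\hbar.
\end{aligned}$$

At this stage, having gained experience in manipulating indices, it is just as
easy to do all the cases at once,

$$\begin{aligned}
[p_i, x_j]f &= -i\hbar\left(\frac{\partial}{\partial x_i}(x_j f) - x_j\frac{\partial f}{\partial x_i}\right),\\
&= -i\hbar\left(\delta_{ij} f + x_j\frac{\partial f}{\partial x_i} - x_j\frac{\partial f}{\partial x_i}\right),\\
&= -i\hbar\,\delta_{ij} f,\\
[p_i, x_j] &= -i\hbar\,\delta_{ij}.
\end{aligned}$$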
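The one-dimensional commutator can also be checked mechanically. The following is a minimal sketch, assuming units with $\hbar = 1$ and test functions restricted to polynomials stored as coefficient lists; the helper names (`times_x`, `momentum`, `commutator_p_x`) are hypothetical and chosen only for this illustration.

```python
# Polynomials in x are stored as coefficient lists:
# [a0, a1, a2, ...] means a0 + a1*x + a2*x**2 + ...
HBAR = 1.0  # assumption: units in which hbar = 1


def times_x(f):
    """Position operator: multiply the polynomial by x."""
    return [0] + list(f)


def momentum(f):
    """Momentum operator p = -i*hbar*d/dx acting on a polynomial."""
    return [-1j * HBAR * k * f[k] for k in range(1, len(f))] or [0]


def subtract(f, g):
    """Coefficient-wise difference f - g, padding the shorter list with zeros."""
    n = max(len(f), len(g))
    f = list(f) + [0] * (n - len(f))
    g = list(g) + [0] * (n - len(g))
    return [a - b for a, b in zip(f, g)]


def commutator_p_x(f):
    """Return [p, x] f = p(x f) - x(p f)."""
    return subtract(momentum(times_x(f)), times_x(momentum(f)))


f = [2, -1, 0, 5]                       # f(x) = 2 - x + 5 x^3
result = commutator_p_x(f)
expected = [-1j * HBAR * a for a in f]  # -i*hbar*f
```

For any polynomial `f`, `result` and `expected` agree coefficient by coefficient, which is the statement $[p_x, x] = -i\hbar$ restricted to this small function space.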


The definition of the angular momentum operator in quantum mechanics is
given by simple extension of the classical formula,

$$\mathbf{L} = \mathbf{r} \times \mathbf{p}. \tag{8.40}$$

The explicit components of angular momentum are,

$$\begin{aligned}
L_x &= y p_z - z p_y,\\
L_y &= z p_x - x p_z,\\
L_z &= x p_y - y p_x.
\end{aligned} \tag{8.41}$$

In index notation,

$$L_i = \epsilon_i{}^{jk} x_j p_k. \tag{8.42}$$




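Let us derive the following commutation relations involving angular momentum,

$$\begin{aligned}
[L_i, x_j] &= i\hbar\,\epsilon^k{}_{ij}\, x_k,\\
[L_i, p_j] &= i\hbar\,\epsilon^k{}_{ij}\, p_k.
\end{aligned} \tag{8.43}$$

For the first equation, we can demonstrate an instance,

$$\begin{aligned}
[L_x, y]f &= (y p_z - z p_y)(y f) - y(y p_z - z p_y) f,\\
&= y^2 p_z f - z p_y(y f) - y^2 p_z f + y z p_y f,\\
&= -z[p_y, y] f,\\
[L_x, y] &= i\hbar z.
\end{aligned}$$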


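But, since we have already introduced the Levi-Civita symbol 2.41, we can use
the commutators $[p_i, x_j] = -i\hbar\,\delta_{ij}$ derived above and have fun
doing all the cases at once. Here is the computation,

$$\begin{aligned}
[L_i, x_j] f &= [\epsilon_i{}^{km} x_k p_m, x_j] f,\\
&= \epsilon_i{}^{km} x_k p_m (x_j f) - x_j\, \epsilon_i{}^{km} x_k p_m f,\\
&= \epsilon_i{}^{km} x_k \left[p_m (x_j f) - x_j p_m (f)\right],\\
&= \epsilon_i{}^{km} x_k [p_m, x_j] f,\\
&= -i\hbar\, x_k\, \epsilon_i{}^{km}\delta_{mj} f,\\
&= -i\hbar\, x_k\, \epsilon_i{}^{k}{}_{j} f,\\
&= i\hbar\, x_k\, \epsilon^k{}_{ij} f.
\end{aligned}$$

The formula for the commutator $[L_i, p_j]$ is very similar and we leave it as an
exercise. Instead, we go for the gold of the commutators.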


8.1.1 Proposition
$$[L_i, L_j] = i\hbar\,\epsilon^k{}_{ij} L_k. \tag{8.44}$$

As above, we show that the concept is easy by doing the following case,

$$\begin{aligned}
[L_x, L_y] &= L_x(z p_x - x p_z) - (z p_x - x p_z) L_x,\\
&= (L_x z p_x - z p_x L_x) - (L_x x p_z - x p_z L_x),\\
&= (L_x z - z L_x) p_x - x (L_x p_z - p_z L_x),\\
&= -i\hbar\, y p_x + i\hbar\, x p_y,\\
&= i\hbar (x p_y - y p_x),\\
&= i\hbar L_z.
\end{aligned}$$

In the third line above we used $p_x L_x = L_x p_x$ and $L_x x = x L_x$, and in
the fourth line we used the commutators 8.43. To do all cases at once, one needs
the product of permutation symbol identities 2.46. This is a great exercise in
index gymnastics but hides the simplicity of the two other independent cases
that can be done as above.
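The commutation relations 8.44 can be checked on a concrete matrix representation. The sketch below is an illustration, not the operator realization used in the text: it assumes $\hbar = 1$ and uses the three-dimensional adjoint matrices $(L_i)_{jk} = -i\epsilon_{ijk}$, a standard choice that is introduced here only for the check.

```python
def levi_civita(i, j, k):
    """Permutation symbol for indices taken from 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


# Adjoint (spin-1) matrices, (L_i)_{jk} = -i * eps_{ijk}, with hbar = 1.
L = [[[-1j * levi_civita(i, j, k) for k in range(3)] for j in range(3)]
     for i in range(3)]


def satisfies_algebra(tol=1e-12):
    """Check [L_i, L_j] = i * sum_k eps_{ijk} L_k for every pair (i, j)."""
    for i in range(3):
        for j in range(3):
            lhs = commutator(L[i], L[j])
            for r in range(3):
                for c in range(3):
                    rhs = sum(1j * levi_civita(i, j, k) * L[k][r][c]
                              for k in range(3))
                    if abs(lhs[r][c] - rhs) > tol:
                        return False
    return True
```

Calling `satisfies_algebra()` returns `True`, covering the case worked out above together with the remaining independent cases.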




Equation 8.44 is the primary reason why the topic of angular momentum is
included in this section. We will work in units of $\hbar$, that is, we set
$\hbar = 1$. Then, comparing with the commutator relations 8.14 for the Pauli
matrices, we see that, apart from a factor of 2, the components of $L$ are
generators of the Lie algebra su(2). Following the QM postulates, we seek a
Hilbert space on which angular momentum acts as a linear operator, that is, a
function space that provides a representation for the algebra. Naturally, we
seek such functions over a two-sphere as the base space. The standard process
one finds in most classic books on quantum mechanics might look a bit mysterious
to those who see it for the first time, but as we will demonstrate, it is just
an implementation of the Cartan subalgebra for the angular momentum
representation of the algebra.

In the case of su(2) there is only one generator in the Cartan subalgebra,
which we choose to be $L_z$, so the rank of the algebra is one. We look for
a representation in which $L_z$ is diagonal. We also seek a Casimir operator,
namely, an operator that commutes with all basis elements of $\mathfrak{g}$. The
number of Casimir operators in a semisimple Lie algebra is equal to the rank of
the algebra. The candidate for the Casimir operator for su(2) is,


$$L^2 = L_x^2 + L_y^2 + L_z^2. \tag{8.45}$$
We show that the operator commutes with all three generators, that is,
$$[L^2, L_x] = [L^2, L_y] = [L^2, L_z] = 0.$$
Let's show, for instance, that $[L^2, L_z] = 0$. We need to establish that
$$[L_x^2 + L_y^2 + L_z^2, L_z] = 0.$$
First, it is easy to verify by direct computation of both sides that in general,
$$[A^2, B] = A[A, B] + [A, B]A.$$
Applying this identity to the square of the components of $L$, we get,

$$\begin{aligned}
[L_x^2, L_z] &= L_x[L_x, L_z] + [L_x, L_z]L_x,\\
&= -iL_xL_y - iL_yL_x,\\
[L_y^2, L_z] &= L_y[L_y, L_z] + [L_y, L_z]L_y,\\
&= iL_yL_x + iL_xL_y,\\
[L_z^2, L_z] &= 0.
\end{aligned}$$

Adding the last three equations yields,
$$\begin{aligned}
[L^2, L_z] &= [L_x^2 + L_y^2 + L_z^2, L_z],\\
&= -iL_xL_y - iL_yL_x + iL_yL_x + iL_xL_y,\\
&= 0.
\end{aligned}$$
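For completeness, the general identity for $[A^2, B]$ used in this computation follows by expanding the right-hand side; the two middle terms cancel:

```latex
\begin{aligned}
A[A,B] + [A,B]A &= A(AB - BA) + (AB - BA)A \\
                &= A^2B - ABA + ABA - BA^2 \\
                &= A^2B - BA^2 \\
                &= [A^2, B].
\end{aligned}
```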
We introduce the ladder operators,
$$\begin{aligned}
L_+ &= L_x + iL_y,\\
L_- &= L_x - iL_y.
\end{aligned} \tag{8.46}$$




We will see below that the ladder operators are the raising and lowering
operators of the algebra. The process of finding the irreducible representations
starts with establishing a number of commutation relationships. First, it is
easy to verify that,

$$\begin{aligned}
L_+L_- &= L_x^2 + L_y^2 + L_z,\\
L_-L_+ &= L_x^2 + L_y^2 - L_z.
\end{aligned} \tag{8.47}$$

Indeed, we have,
$$\begin{aligned}
L_+L_- &= (L_x + iL_y)(L_x - iL_y),\\
&= L_x^2 + L_y^2 + i(L_yL_x - L_xL_y),\\
&= L_x^2 + L_y^2 + i[L_y, L_x],\\
&= L_x^2 + L_y^2 + L_z,
\end{aligned}$$

and similarly for $L_-L_+$. It is also easy to verify that,

$$\begin{aligned}
[L_+, L_-] &= 2L_z,\\
[L_z, L_+] &= L_+,\\
[L_z, L_-] &= -L_-.
\end{aligned} \tag{8.48}$$


These commutators define the Cartan subalgebra. Since in this case, the
subalgebra is one-dimensional, the roots are the vectors in $\mathbb{R}$ given by
$r^{(1)} = (1)$ and $r^{(2)} = (-1)$. We associate the 0 root with $L_z$. The root
diagram, called $A_1$, is as simple as it gets; it consists of two unit vectors
at the origin in $\mathbb{R}$. Putting the commutator results above together leads
to the following formula,
$$\begin{aligned}
L^2 &= L_+L_- + L_z^2 - L_z,\\
&= L_-L_+ + L_z^2 + L_z.
\end{aligned} \tag{8.49}$$
The result follows directly from equation 8.47. We show the steps for the first
of these.
$$\begin{aligned}
L_+L_- &= L_x^2 + L_y^2 + L_z,\\
&= L^2 - L_z^2 + L_z,\\
L^2 &= L_+L_- + L_z^2 - L_z.
\end{aligned}$$


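The formalism can then be used to obtain the ubiquitous expression for the
angular momentum operator of a single particle in spherical coordinates. The
computation starts with inverting the Jacobian matrix in 2.30,

$$\frac{\partial}{\partial x} = \sin\theta\cos\phi\,\frac{\partial}{\partial r}
+ \frac{1}{r}\cos\theta\cos\phi\,\frac{\partial}{\partial\theta}
- \frac{\sin\phi}{r\sin\theta}\,\frac{\partial}{\partial\phi}, \tag{8.50}$$

$$\frac{\partial}{\partial y} = \sin\theta\sin\phi\,\frac{\partial}{\partial r}
+ \frac{1}{r}\cos\theta\sin\phi\,\frac{\partial}{\partial\theta}
+ \frac{\cos\phi}{r\sin\theta}\,\frac{\partial}{\partial\phi}, \tag{8.51}$$

$$\frac{\partial}{\partial z} = \cos\theta\,\frac{\partial}{\partial r}
- \frac{1}{r}\sin\theta\,\frac{\partial}{\partial\theta}. \tag{8.52}$$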




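After some algebraic manipulations, we get,

$$\begin{aligned}
L_z &= \frac{1}{i}\frac{\partial}{\partial\phi},\\
L_\pm &= e^{\pm i\phi}\left(\pm\frac{\partial}{\partial\theta} + i\cot\theta\,\frac{\partial}{\partial\phi}\right),\\
L^2 &= -\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial}{\partial\theta}\right)
+ \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}\right].
\end{aligned} \tag{8.53}$$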


Comparing with equation 3.9 we recognize $L^2$ as the angular part of the
spherical Laplacian. Perhaps the most direct way to get the function space is to
solve Laplace's equation in spherical coordinates by separation of variables,
that is, assuming that the solution is of the form $\psi = \Theta(\theta)\Phi(\phi)$.
The complete process of obtaining the solutions is best suited for a course on
partial differential equations or a course in electrodynamics, so we just
present the result. The equation is manifestly self-adjoint, hence the
eigenvalues are real and eigenvectors corresponding to different eigenvalues are
orthogonal. The eigenvalue equations are,

$$\begin{aligned}
L_z Y_{l,m}(\theta,\phi) &= m\, Y_{l,m}(\theta,\phi),\\
L^2 Y_{l,m}(\theta,\phi) &= l(l+1)\, Y_{l,m}(\theta,\phi).
\end{aligned} \tag{8.54}$$
Here, the orbital quantum numbers $l$ labeling the eigenvalues $l(l+1)$ are
non-negative integers.


The solutions of Laplace's equation in spherical coordinates are the well-known
spherical harmonics $Y_{lm}$,

$$Y_{lm} = \sqrt{\frac{2l+1}{4\pi}\frac{(l-m)!}{(l+m)!}}\; P_{lm}(\cos\theta)\, e^{im\phi}, \tag{8.55}$$

where $P_{lm}$ are the associated Legendre polynomials given by Rodrigues'
formula,

$$P_{lm}(\cos\theta) = \frac{(-1)^{l+m}}{2^l\, l!}\,\sin^m\theta\,
\frac{d^{\,l+m}}{d(\cos\theta)^{\,l+m}}\left(\sin^{2l}\theta\right). \tag{8.56}$$
For each $l$ there are $2l+1$ possible values for $m$ given by the integers $m = -l, \dots, l$.
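As a quick check of the eigenvalue equations, one can apply the standard spherical-coordinate form of $L^2$ (with $\hbar = 1$) to the two simplest $l = 1$ angular functions. Normalization constants are omitted here because they do not affect the eigenvalues; the functions $\cos\theta$ and $\sin\theta\,e^{i\phi}$ are the usual unnormalized $l = 1$ harmonics with $m = 0$ and $m = 1$.

```latex
\begin{aligned}
L^2 &= -\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}
      \left(\sin\theta\,\frac{\partial}{\partial\theta}\right)
      + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}\right], \\[4pt]
L^2\cos\theta &= -\frac{1}{\sin\theta}\frac{d}{d\theta}\left(-\sin^2\theta\right)
      = 2\cos\theta = 1(1+1)\cos\theta, \\[4pt]
L^2\left(\sin\theta\, e^{i\phi}\right)
   &= -\left[\frac{\cos^2\theta - \sin^2\theta}{\sin\theta} - \frac{1}{\sin\theta}\right] e^{i\phi}
      = 2\sin\theta\, e^{i\phi}, \\[4pt]
-i\frac{\partial}{\partial\phi}\left(\sin\theta\, e^{i\phi}\right) &= 1\cdot \sin\theta\, e^{i\phi}.
\end{aligned}
```

Both functions have $L^2$ eigenvalue $l(l+1) = 2$, and the second has $L_z$ eigenvalue $m = 1$, as expected.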


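The eigenfunctions $\psi$ can also be obtained by a method which is almost
entirely algebraic. First, the eigenvalue equation for the z-component of angular
momentum is,

$$L_z\psi = \lambda\psi, \qquad \frac{1}{i}\frac{\partial\psi}{\partial\phi} = \lambda\psi.$$
The solution is
$$\psi = f(\theta)\, e^{i\lambda\phi}.$$

For the function to be periodic on a sphere, we must have $\lambda = m \in \mathbb{Z}$.
The normalized eigenfunctions of the $\phi$ component are,

$$\Phi(\phi) = \frac{1}{\sqrt{2\pi}}\, e^{im\phi}, \qquad m = 0, \pm 1, \pm 2, \dots$$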




Since $L^2 - L_z^2 = L_x^2 + L_y^2$ is physically a positive operator, the
absolute values of the eigenvalues of $L_z$ for a given $L^2$ are bounded. Let $l$
be the integer corresponding to the largest value of $L_z$ for a particular value
of $L^2$. We switch to the bracket notation and denote the eigenstates by

$$\psi = |l, m\rangle.$$

Let $\psi_m$ be an eigenstate of $L_z$ and recall from equation 8.48 that
$[L_z, L_\pm] = \pm L_\pm$. Apply $L_z$ to the state $L_\pm\psi_m$. We get,

$$\begin{aligned}
L_z(L_\pm|l,m\rangle) &= (L_\pm L_z + [L_z, L_\pm])|l,m\rangle,\\
&= (L_\pm L_z \pm L_\pm)|l,m\rangle,\\
&= (m \pm 1)(L_\pm|l,m\rangle).
\end{aligned}$$

Thus, if $\psi_m$ is an eigenfunction with eigenvalue $m$, then $L_\pm\psi_m$ are
eigenfunctions with eigenvalues $m \pm 1$. The ladder operators $L_\pm$ raise or
lower the $m$ quantum number without changing the eigenvalue of $L^2$. Hence, if
$m = l$ is the maximum value for a particular state of $L^2$, we have,

$$L_+\psi = 0.$$

Now, apply $L_-$ to this state. Using formula 8.49 yields,
$$L_-L_+\psi = (L^2 - L_z^2 - L_z)\psi = 0.$$


But we are seeking states which are simultaneous eigenfunctions of the
commuting operators $L^2$ and $L_z$, so the eigenvalue of $L^2$ must be $l(l+1)$.
For those who have studied the solution of Laplace's equation by separation of
variables and infinite series, this would correspond to the step where one sets
the eigenvalue of Legendre's differential equation to $l(l+1)$ to cause the
infinite series to terminate, and thus yield polynomial solutions. In this
manner, we have recovered the eigenvalues in 8.54 almost entirely algebraically,

$$\begin{aligned}
L_z|l,m\rangle &= m\,|l,m\rangle,\\
L^2|l,m\rangle &= l(l+1)\,|l,m\rangle.
\end{aligned} \tag{8.57}$$


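Eigenstates are basis vectors for the Hilbert space, so they should be normalized.
Thus, we require
$$\langle l,m|l,m\rangle = 1.$$

Suppose we have constants $C^\pm_{l,m}$ such that,

$$\Psi = L_\pm|l,m\rangle = C^\pm_{l,m}\,|l, m\pm 1\rangle.$$

Then, $\langle\Psi|\Psi\rangle = |C^\pm_{l,m}|^2$. On the other hand, since $L_\mp$
is the adjoint of $L_\pm$, we have $\langle\Psi|\Psi\rangle = \langle l,m|L_\mp L_\pm|l,m\rangle$, and

$$\begin{aligned}
L_\mp L_\pm|l,m\rangle &= (L^2 - L_z^2 \mp L_z)|l,m\rangle,\\
&= (l(l+1) - m(m\pm 1))|l,m\rangle,
\end{aligned}$$

so we choose,

$$C^\pm_{l,m} = \sqrt{l(l+1) - m(m\pm 1)}.$$

We conclude that the effect of applying the ladder operators to normalized
eigenstates is

$$L_\pm|l,m\rangle = \sqrt{l(l+1) - m(m\pm 1)}\;|l, m\pm 1\rangle. \tag{8.58}$$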
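The action of the ladder operators on normalized eigenstates determines the matrices of the representation explicitly. The following minimal sketch, assuming $\hbar = 1$, integer $l$, and the basis ordering $m = l, l-1, \dots, -l$ (an ordering chosen here for illustration), builds $L_z$ and $L_\pm$ from the coefficients $\sqrt{l(l+1) - m(m\pm 1)}$ and lets one check the commutators and the Casimir eigenvalue numerically. The function names are hypothetical.

```python
from math import sqrt


def ladder_matrices(l):
    """Matrices of L_z, L_+, L_- in the basis |l,m>, ordered m = l, ..., -l."""
    ms = [l - k for k in range(2 * l + 1)]
    n = len(ms)
    lz = [[float(ms[r]) if r == c else 0.0 for c in range(n)] for r in range(n)]
    lp = [[0.0] * n for _ in range(n)]
    lm = [[0.0] * n for _ in range(n)]
    for c, m in enumerate(ms):
        if m < l:    # L_+|l,m> = sqrt(l(l+1) - m(m+1)) |l,m+1>, stored in row c-1
            lp[c - 1][c] = sqrt(l * (l + 1) - m * (m + 1))
        if m > -l:   # L_-|l,m> = sqrt(l(l+1) - m(m-1)) |l,m-1>, stored in row c+1
            lm[c + 1][c] = sqrt(l * (l + 1) - m * (m - 1))
    return lz, lp, lm


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def combine(a, b, s=1.0):
    """Return a + s*b entry by entry."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def commutator(a, b):
    return combine(matmul(a, b), matmul(b, a), -1.0)


def casimir(l):
    """L^2 = L_+ L_- + L_z^2 - L_z, the first form of the Casimir formula."""
    lz, lp, lm = ladder_matrices(l)
    return combine(combine(matmul(lp, lm), matmul(lz, lz)), lz, -1.0)


def close(a, b, tol=1e-9):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))
```

For example, with `lz, lp, lm = ladder_matrices(2)`, the matrix `commutator(lz, lp)` agrees with `lp`, `commutator(lp, lm)` agrees with twice `lz`, and `casimir(2)` is $6$ times the $5\times 5$ identity, that is, $l(l+1)$ with $l = 2$.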


Recalling that $L_+$ is a linear first order differential operator, and noting that
$L_+|l,l\rangle = 0$, we get a linear first order differential equation for
$\Theta(\theta)$ that is very easy to solve. Then, repeatedly applying lowering
operators to the solution leads to Rodrigues' formula for the associated
Legendre polynomials. The matrix elements of the representation are complicated.
They are described by unitary $(2l+1)$-dimensional matrices called Wigner
D-matrices. If $R(\psi,\theta,\phi)$ is a rotation by Euler angles, and
$|l,m\rangle$ are spherical harmonic eigenstates, the matrix elements are given by,

$$D^l{}_{m'm}(\psi,\theta,\phi) = \langle l,m'|R(\psi,\theta,\phi)|l,m\rangle, \tag{8.59}$$
$$\phantom{D^l{}_{m'm}(\psi,\theta,\phi)} = e^{-im'\psi}\, d^l{}_{m'm}(\theta)\, e^{-im\phi}, \tag{8.60}$$

where,
$$d^l{}_{m'm}(\theta) = \langle l,m'|e^{-i\theta L_y}|l,m\rangle = D^l{}_{m'm}(0,\theta,0). \tag{8.61}$$
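The first order differential equation for the highest-weight state mentioned above can be written out in a few lines. This sketch assumes $\hbar = 1$, the spherical-coordinate form $L_+ = e^{i\phi}\left(\partial_\theta + i\cot\theta\,\partial_\phi\right)$, and a separated state $\Theta(\theta)e^{il\phi}$ with $m = l$; the constant $C$ is fixed afterwards by normalization.

```latex
\begin{aligned}
0 = L_+\left(\Theta(\theta)\, e^{il\phi}\right)
  &= e^{i\phi}\left(\frac{\partial}{\partial\theta} + i\cot\theta\,\frac{\partial}{\partial\phi}\right)
     \Theta(\theta)\, e^{il\phi} \\
  &= e^{i(l+1)\phi}\left(\frac{d\Theta}{d\theta} - l\cot\theta\;\Theta\right), \\[4pt]
\frac{d\Theta}{\Theta} &= l\,\frac{\cos\theta}{\sin\theta}\, d\theta
  \quad\Longrightarrow\quad \Theta(\theta) = C\sin^l\theta .
\end{aligned}
```

So the top state of each multiplet is proportional to $\sin^l\theta\, e^{il\phi}$, and the remaining states follow by applying $L_-$.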


We content ourselves in these notes with whetting the appetite of the reader to
dig into more details in any senior or first-year graduate level text in quantum
mechanics.


8.2 Lorentz Group 


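The appropriate symmetry group in special relativity is the Lorentz group.
This is the group of transformations that leaves invariant the metric
$\eta = \mathrm{diag}(+\,-\,-\,-)$ in Minkowski space $M_{1,3}$. We will denote
Lorentz transformations by the notation,

$$x'^\mu = L^\mu{}_\nu\, x^\nu. \tag{8.62}$$

The metric is invariant if $\langle Lx, Lx\rangle = \langle x, x\rangle$, that is,
$$\eta_{\mu\nu}\, L^\mu{}_\alpha L^\nu{}_\beta = \eta_{\alpha\beta}.$$

Transformations for which $|L^\mu{}_\nu| = 1$ and $L^0{}_0 > 0$, so that past and
future are not interchanged, constitute the proper, orthochronous Lorentz group
$SO^+(1,3)$. The Lorentz transformation law for tensors $T$ is the same as in the
Riemannian metric case, as shown in equation 7.16,

$$T'^{\beta_1\dots\beta_r}{}_{\alpha_1\dots\alpha_s} =
\frac{\partial x'^{\beta_1}}{\partial x^{\mu_1}}\cdots\frac{\partial x'^{\beta_r}}{\partial x^{\mu_r}}\,
\frac{\partial x^{\nu_1}}{\partial x'^{\alpha_1}}\cdots\frac{\partial x^{\nu_s}}{\partial x'^{\alpha_s}}\;
T^{\mu_1\dots\mu_r}{}_{\nu_1\dots\nu_s}. \tag{8.63}$$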
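As usual, the metric is used to raise and lower indices and thus convert between
covariant and contravariant tensors. Another way to obtain a covariant tensor
from a contravariant one is by use of the permutation symbol, but we need to
be a little careful. As noted in the paragraph elaborating on the Hodge star
operator 2.87, the Levi-Civita symbol does not transform like a tensor, but
rather, like a tensor density of weight $(-1)$. Instead, if $g$ is any Riemannian
or pseudo-Riemannian metric such as $\eta$, we define
$\varepsilon_{\mu\nu\kappa\lambda} = \sqrt{\det g}\;\epsilon_{\mu\nu\kappa\lambda}$, which
does transform like a tensor, called the Levi-Civita tensor. There are general
formulas similar to 2.46 for contractions of the Levi-Civita symbol for any
dimension. The pattern can be inferred from the explicit formulas for dimension
four listed below,

$$\begin{aligned}
\epsilon^{\mu\nu\kappa\lambda}\epsilon_{\alpha\beta\gamma\delta} &= \delta^{\mu\nu\kappa\lambda}_{\alpha\beta\gamma\delta},\\
\epsilon^{\mu\nu\kappa\lambda}\epsilon_{\alpha\beta\gamma\lambda} &= \delta^{\mu\nu\kappa}_{\alpha\beta\gamma},\\
\epsilon^{\mu\nu\kappa\lambda}\epsilon_{\alpha\beta\kappa\lambda} &= 2\,\delta^{\mu\nu}_{\alpha\beta}
 = 2(\delta^\mu_\alpha\delta^\nu_\beta - \delta^\mu_\beta\delta^\nu_\alpha),\\
\epsilon^{\mu\nu\kappa\lambda}\epsilon_{\alpha\nu\kappa\lambda} &= 3!\,\delta^\mu_\alpha.
\end{aligned} \tag{8.64}$$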
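The contraction pattern can be verified by brute force. The sketch below treats the purely combinatorial permutation symbol in dimension four, with upper and lower index versions having the same numerical values; this is an assumption made for the illustration, and no metric signs from a Lorentzian signature are included. The function names are hypothetical.

```python
from itertools import permutations, product


def permutation_sign(p):
    """Sign of a permutation of 0..n-1, computed by sorting with swaps."""
    p = list(p)
    sign = 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            sign = -sign
    return sign


N = 4
EPS = {p: permutation_sign(p) for p in permutations(range(N))}


def eps(*idx):
    """Permutation symbol; zero whenever an index is repeated."""
    return EPS.get(idx, 0)


def contract_three(mu, alpha):
    """Sum over three shared indices; expected value 3! * delta(mu, alpha)."""
    return sum(eps(mu, n, k, l) * eps(alpha, n, k, l)
               for n, k, l in product(range(N), repeat=3))


def contract_two(mu, nu, alpha, beta):
    """Sum over two shared indices; expected 2*(d_ma d_nb - d_mb d_na)."""
    return sum(eps(mu, nu, k, l) * eps(alpha, beta, k, l)
               for k, l in product(range(N), repeat=2))


def full_contraction():
    """Sum over all four indices; expected value 4! = 24."""
    return sum(s * s for s in EPS.values())
```

Here `contract_three(m, a)` is `6` when `m == a` and `0` otherwise, and `contract_two(0, 1, 0, 1)` is `2` while `contract_two(0, 1, 1, 0)` is `-2`, in agreement with the generalized Kronecker delta pattern.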
An appropriate contraction of a tensor $T$ with the Levi-Civita tensor gives
another tensor called the dual tensor ${}^*T$. Thus, for example, the dual of an
antisymmetric tensor $T_{\mu\nu}$ is

$${}^*T^{\alpha\beta} = \frac{1}{2\sqrt{\det g}}\;\epsilon^{\alpha\beta\mu\nu}\, T_{\mu\nu}. \tag{8.65}$$
A tensor as above for which ${}^*T_{\mu\nu} = T_{\mu\nu}$ is called self-dual. Such
self-dual tensors play a special role in the representation of the Lorentz group.


8.2.1 Infinitesimal Transformations 


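There are 6 infinitesimal generators for the Lie algebra so(1,3). We can take
these generators to be,

$$j_1 = \begin{pmatrix} 0&0&0&0\\ 0&0&0&0\\ 0&0&0&-1\\ 0&0&1&0 \end{pmatrix},\quad
j_2 = \begin{pmatrix} 0&0&0&0\\ 0&0&0&1\\ 0&0&0&0\\ 0&-1&0&0 \end{pmatrix},\quad
j_3 = \begin{pmatrix} 0&0&0&0\\ 0&0&-1&0\\ 0&1&0&0\\ 0&0&0&0 \end{pmatrix},$$

$$k_1 = \begin{pmatrix} 0&1&0&0\\ 1&0&0&0\\ 0&0&0&0\\ 0&0&0&0 \end{pmatrix},\quad
k_2 = \begin{pmatrix} 0&0&1&0\\ 0&0&0&0\\ 1&0&0&0\\ 0&0&0&0 \end{pmatrix},\quad
k_3 = \begin{pmatrix} 0&0&0&1\\ 0&0&0&0\\ 0&0&0&0\\ 1&0&0&0 \end{pmatrix}.$$

The three generators $\{j_1, j_2, j_3\}$ correspond to the subgroup SO(3) of spatial
rotations and thus span the subalgebra so(3). The exponential map of these
generators yields matrices in which the spatial $3 \times 3$ blocks are the same
rotation matrices as in 8.1.2. There are three other generators, which we call
$\{k_1, k_2, k_3\}$, involving the time parameter. These generate the boosts. The
$k$ generators are not manifestly antisymmetric, but that is because of the
signature of the metric. The exponential map of the boost generators yields
hyperbolic blocks. For example, the $k_1$ infinitesimal transformation in the $t$
and $x$ coordinates,

$$\begin{pmatrix} dt \\ dx \end{pmatrix} =
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
\begin{pmatrix} t \\ x \end{pmatrix} d\theta,$$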




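yields a Lorentz transformation of the form,

$$\begin{pmatrix}
\cosh\theta & \sinh\theta & 0 & 0\\
\sinh\theta & \cosh\theta & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1
\end{pmatrix}.$$

The transformation does represent a rotation, but it is imaginary. Here

$$\cosh\theta = \frac{1}{\sqrt{1-\beta^2}}, \qquad \sinh\theta = \frac{\beta}{\sqrt{1-\beta^2}},$$
with $\beta = v/c$. This is the way these transformations appear in a first course
in special relativity. Any infinitesimal Lorentz transformation
$x'^\mu = L^\mu{}_\nu x^\nu$ has the form,
$$L^\mu{}_\nu = \delta^\mu_\nu + \omega^\mu{}_\nu, \quad\text{or,}\quad
L_{\mu\nu} = \eta_{\mu\nu} + \omega_{\mu\nu}.$$
To preserve the metric, we must have, to first order in $\omega$,
$$\begin{aligned}
\eta_{\mu\nu}\, L^\mu{}_\alpha L^\nu{}_\beta &= \eta_{\mu\nu}(\delta^\mu_\alpha + \omega^\mu{}_\alpha)(\delta^\nu_\beta + \omega^\nu{}_\beta),\\
&= \eta_{\alpha\beta} + \omega_{\alpha\beta} + \omega_{\beta\alpha},\\
&= \eta_{\alpha\beta}.
\end{aligned}$$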

This means that $\omega_{\alpha\beta}$ is antisymmetric, as expected from the analysis
of the similar situation with the exponential map of the rotation group. In any
representation of the Lorentz group, the infinitesimal transformations take the
form,

$$I + \tfrac{1}{2}\,\omega^{\mu\nu} M_{\mu\nu}, \tag{8.66}$$

where $M_{\mu\nu}$ are the antisymmetric matrices representing the six infinitesimal
transformations. These matrices obey the integrability commutation relations,

$$\begin{aligned}
[M_{\mu\nu}, M_{\sigma\tau}] &= M_{\mu\nu}M_{\sigma\tau} - M_{\sigma\tau}M_{\mu\nu},\\
&= g_{\nu\sigma}M_{\mu\tau} - g_{\mu\sigma}M_{\nu\tau} + g_{\mu\tau}M_{\nu\sigma} - g_{\nu\tau}M_{\mu\sigma}.
\end{aligned} \tag{8.67}$$
Apart from a factor of $i\hbar$, this is the relativistic angular momentum tensor,
symbolically written $r \wedge p$, where $r$ and $p$ are the position and the
4-momentum respectively. The notation means that

$$M^{\mu\nu} = x^\mu p^\nu - x^\nu p^\mu.$$

The spatial components $J_k = \epsilon_k{}^{ij} M_{ij}$ are the generators of the
subgroup SO(3), and the spatial-temporal components $K_i = M_{0i}$ generate the
boosts. The Lie algebra commutator relations are given by

$$\begin{aligned}
[J_i, J_j] &= i\epsilon^k{}_{ij} J_k,\\
[J_i, K_j] &= i\epsilon^k{}_{ij} K_k,\\
[K_i, K_j] &= -i\epsilon^k{}_{ij} J_k,
\end{aligned} \tag{8.68}$$


where,
$$J = \tfrac{1}{2}\sigma.$$

For spin $\tfrac{1}{2}$, the boosts are generated by,

$$K = \pm\tfrac{i}{2}\sigma,$$

giving two inequivalent representations.

These matrices constitute a representation of the Lie algebra of the Lorentz
group called the $(\tfrac{1}{2},0) \oplus (0,\tfrac{1}{2})$ spin representation. The
group elements are given by the exponential map,

$$L^\mu{}_\nu = \exp\left[\tfrac{1}{2}\,\omega^{\alpha\beta}(M_{\alpha\beta})^\mu{}_\nu\right]. \tag{8.69}$$


8.2.2 Spinors 


In a manner analogous to the construction of the 2-1 homomorphism between
SU(2) and SO(3), starting with the map 8.8, we seek a representation of Lorentz
transformations of a vector $x^\mu \in M_{1,3}$ in terms of transformations of the
$2 \times 2$ Hermitian matrix $X = X^{A\dot B}$. Since $\det X$ is equal to the norm
of the vector that we wish to preserve, the condition is equivalent to invariance
under unimodular transformations

$$X' = Q X Q^\dagger, \tag{8.70}$$

where $Q \in SL(2,\mathbb{C})$.
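The invariance of the norm under unimodular transformations is easy to test numerically. The sketch below assumes the standard identification $X = x^\mu\sigma_\mu$ with $\sigma_0 = I$ and the usual Pauli matrices; this may differ by index conventions from the map 8.8 in the text, and the helper names are hypothetical.

```python
# sigma_0 = identity, sigma_1..3 = standard Pauli matrices (assumed convention)
SIGMA = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def hermitian_from_vector(x):
    """X = x^mu sigma_mu for a 4-vector x = (t, x, y, z)."""
    return [[sum(x[mu] * SIGMA[mu][a][b] for mu in range(4)) for b in range(2)]
            for a in range(2)]


def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def transform(q, x_matrix):
    """X' = Q X Q^dagger."""
    return matmul2(q, matmul2(x_matrix, dagger(q)))


# A unimodular matrix: choose a, b, c freely and solve a*d - b*c = 1 for d.
a, b, c = 1 + 2j, 0.5j, -1.0
Q = [[a, b], [c, (1 + b * c) / a]]

x = [2.0, 0.3, -1.1, 0.7]           # (t, x, y, z)
X = hermitian_from_vector(x)
X_prime = transform(Q, X)
```

Here `det2(X)` equals $t^2 - x^2 - y^2 - z^2 = 2.21$ up to rounding, `det2(X_prime)` has the same value, and `X_prime` is again Hermitian, so it corresponds to a real 4-vector with the same norm.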
We introduce spin space as a pair $\{S_2, \epsilon_{AB}\}$, where $S_2$ is a
2-dimensional complex vector space and $\epsilon_{AB}$ is the symplectic form with
components,

$$\epsilon_{AB} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \tag{8.71}$$

The matrix elements of the symplectic form are the same as the Levi-Civita
symbol in dimension 2. It is assumed that spinors obey the transformation law,

$$\phi'_A = Q_A{}^B\,\phi_B, \tag{8.72}$$
where $Q \in SL(2,\mathbb{C})$. An element $\phi_A \in S_2$ is called a covariant
2-spinor of rank 1. Associated with $S_2$ there are three other spaces, the dual
$S_2^*$, the complex conjugate $\bar{S}_2$, and the complex conjugate dual
$\bar{S}_2^*$. We will use the following index convention,

$$\phi_A \in S_2, \qquad \phi^A \in S_2^*, \qquad
\phi_{\dot A} \in \bar{S}_2, \qquad \phi^{\dot A} \in \bar{S}_2^*.$$
We introduce dual and conjugate versions of the symplectic form,
$\epsilon_{\dot A\dot B}$, $\epsilon^{AB}$, etc., all of which have the same matrix
values. We use these to manipulate spinor indices according to the rules,

$$\phi_A = \epsilon_{AB}\,\phi^B, \qquad \phi^B = \phi_A\,\epsilon^{AB}, \tag{8.73}$$
$$\phi_{\dot A} = \epsilon_{\dot A\dot B}\,\phi^{\dot B}, \qquad \phi^{\dot B} = \phi_{\dot A}\,\epsilon^{\dot A\dot B}, \tag{8.74}$$

namely, we lower on the left and raise on the right. The two operations are
inverses of each other, as we can easily verify by lowering, then raising an
index,

$$\begin{aligned}
\phi^A &= (\epsilon_{BC}\,\phi^C)\,\epsilon^{BA},\\
&= \epsilon_{BC}\,\epsilon^{BA}\,\phi^C,\\
&= \delta^A_C\,\phi^C = \phi^A.
\end{aligned}$$

Here we have used the permutation symbol identity, which is the same as for
the Levi-Civita symbol,
$$\epsilon_{BC}\,\epsilon^{BA} = \delta^A_C.$$
Higher rank spinors can be constructed as elements of tensor products of spin
spaces. Because of the antisymmetry of the symplectic form, one needs to be
more careful when raising and lowering spinor indices. For instance, consider
$\phi^A\psi_A \in S_2^* \otimes S_2$. Then,

$$\begin{aligned}
\phi^A\psi_A &= \phi^A\,\epsilon_{AB}\,\psi^B,\\
&= \epsilon_{AB}\,\phi^A\psi^B,\\
&= -\epsilon_{BA}\,\phi^A\psi^B,\\
&= -\phi_B\,\psi^B,\\
&= -\phi_A\,\psi^A.
\end{aligned}$$

We conclude that exchanging the position of a repeated spinor index introduces
a minus sign. In particular, for any rank one spinor $\phi_A$, we have

$$\phi^A\phi_A = \phi_A\phi^A = 0.$$
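These index rules are simple enough to test with plain lists. The following minimal sketch assumes the component values $\epsilon_{12} = 1 = -\epsilon_{21}$ for both the lower and upper symplectic forms and implements "lower on the left, raise on the right"; the function names are hypothetical.

```python
# Components of the symplectic form; eps_{AB} and eps^{AB} share these values.
EPS = [[0, 1], [-1, 0]]


def lower_index(phi_up):
    """phi_A = eps_{AB} phi^B (lower on the left)."""
    return [sum(EPS[a][b] * phi_up[b] for b in range(2)) for a in range(2)]


def raise_index(phi_down):
    """phi^B = phi_A eps^{AB} (raise on the right)."""
    return [sum(phi_down[a] * EPS[a][b] for a in range(2)) for b in range(2)]


def contract(u, v):
    """Sum over one repeated spinor index."""
    return sum(x * y for x, y in zip(u, v))


phi_up = [1 + 2j, -0.5j]
psi_up = [0.3, 2 - 1j]

round_trip = raise_index(lower_index(phi_up))          # should equal phi_up
up_down = contract(phi_up, lower_index(psi_up))        # phi^A psi_A
down_up = contract(lower_index(phi_up), psi_up)        # phi_A psi^A
self_contraction = contract(phi_up, lower_index(phi_up))
```

The values satisfy `round_trip == phi_up` up to rounding, `up_down == -down_up`, and `self_contraction == 0`, matching the sign rule for exchanging the position of a repeated spinor index.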


It follows that the full contraction, such as $\phi^{ABC}\phi_{ABC}$, of a spinor
with an odd number of indices with itself is zero. If a spinor is symmetric on any
two indices, then contracting on those two indices gives 0. Contraction on two
indices reduces the rank of a spinor by two. If a spinor is completely symmetric,
it is not possible to reduce the rank by contraction since any such contraction
is zero. The only completely antisymmetric spinor must be of rank two since
the indices can only attain the values 1 or 2 and there are only two permutations
possible. In fact, a completely antisymmetric spinor must be a multiple of
$\epsilon_{AB}$. Thus, for example, for any spinor $\phi_A$, we have

$$\epsilon_{AB}\,\phi_C + \epsilon_{CA}\,\phi_B + \epsilon_{BC}\,\phi_A = 0,$$

because this combination of spinors is antisymmetric and of rank 3. Another
good example is the relation,

$$\phi_{AC}\,\phi_B{}^C = -\phi_A{}^C\,\phi_{BC},$$

which is true since the position of the index $C$ was exchanged. Hence the
quantity on the left is an antisymmetric spinor of rank two and it must be a
multiple of $\epsilon_{AB}$. It is easy to check that the correct multiplicative
factor is given by,

$$\phi_{AC}\,\phi_B{}^C = -\tfrac{1}{2}\,\phi_{CD}\,\phi^{CD}\,\epsilon_{AB}.$$



We can also have spin-tensors, the main example being the “connecting
spinor” $\sigma_\mu{}^{A\dot B}$ in 8.10, which has one covariant tensor index and
two spinor indices. For convenience, we list the components here again,


tat ht ol oho 3) 


It is elementary to verify that the inverse matrices $\sigma^\mu{}_{A\dot B}$, that
is, the matrices such that,

$$\sigma^\mu{}_{A\dot B}\;\sigma_\nu{}^{A\dot B} = 2\,\delta^\mu_\nu,$$

are obtained by raising the tensor index with the metric and lowering the spinor
indices with the symplectic form. With these matrices, the reverse of equation
8.9 is,

$$x^\mu = \tfrac{1}{2}\,\sigma^\mu{}_{A\dot B}\, X^{A\dot B}. \tag{8.75}$$


A few words about index conventions are in order. The convention used here
most closely resembles that of the early developers Veblen and Taub (see, for
example, [36]). The main difference here is in choosing $\sigma_\mu{}^{A\dot B}$ to be
consistent with Pauli matrices; this is closer to the choice in the Penrose prime
notation. In following established protocol, I have reluctantly adopted the
notation with both indices up, which is inconsistent with the index summation
convention for matrix multiplication of index-free expressions such as
$\sigma_1\sigma_2$. My preference would have been to choose
$\sigma_\mu{}^A{}_{\dot B}$ to correspond to the Pauli matrices. Instead,
lowering the second index results in the following matrices,


ae Caceres 


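It is straightforward to verify the following spinor identities,

$$\begin{aligned}
\sigma^{\mu A\dot B}\,\sigma_{\mu C\dot D} &= 2\,\delta^A_C\,\delta^{\dot B}_{\dot D},\\
\sigma_\mu{}^{A\dot B}\,\sigma^{\mu C\dot D} &= 2\,\epsilon^{AC}\,\epsilon^{\dot B\dot D},\\
\sigma_\mu{}^{A\dot B}\,\sigma_{\nu C\dot B} + \sigma_\nu{}^{A\dot B}\,\sigma_{\mu C\dot B} &= 2\,g_{\mu\nu}\,\delta^A_C.
\end{aligned} \tag{8.76}$$

Remembering that $\sigma$ matrices are Hermitian, and switching the position of
the summation index $\dot B$, the last equation above can be rewritten as,

$$\sigma_\mu{}^A{}_{\dot B}\;\bar\sigma_\nu{}^{\dot B}{}_C + \sigma_\nu{}^A{}_{\dot B}\;\bar\sigma_\mu{}^{\dot B}{}_C = -2\,g_{\mu\nu}\,\delta^A_C.$$

This equation is in the right summation index format for matrix multiplication,
so it can be written in an index-free form as,

$$\sigma^\mu\bar\sigma^\nu + \sigma^\nu\bar\sigma^\mu = -2\,g^{\mu\nu} I. \tag{8.77}$$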




In view of the remarks above, the reader should be cautioned that the matrices
in this neat equation are not the standard Pauli matrices in the coordinate
representation we have chosen. Equation 8.77 is most important in formulating
Dirac's equation, as shown below.

Analogous to the situation for SO(3), for every proper Lorentz transformation
$L^\mu{}_\nu$, there are two unimodular matrices $Q$ and $-Q$ such that

$$\sigma_\mu = Q\,\sigma_\nu\,Q^\dagger\, L^\nu{}_\mu. \tag{8.78}$$

A similar statement holds for improper Lorentz transformations. In this case,
we have
$$\sigma_\mu = Q\,\bar\sigma_\nu\,Q^\dagger\, L^\nu{}_\mu. \tag{8.79}$$


As discovered by Dirac, to obtain a relativistic extension of Schrödinger’s 
equation in the spin 1/2 representation, which is invariant under Lorentz trans- 
formations, one must introduce a second spinor field 74. The field equations 
are, 


a G5) pP = —imey ^, (8.80) 


where the o spin-tensors satisfy equation 8.77. The more familiar 4-spinor Dirac 
equation is obtained by rewriting 8.80 in matrix form, 


h ð 0 —iot^ p] [68] _ oA 
a lint, 0 “| [$] THS [fal 


which is the standard Dirac equation for a free particle,
$$(\gamma^\mu p_\mu - mc)\Psi = 0. \tag{8.81}$$


Here, 


is a Dirac 4-spinor, and $\gamma^\mu$ has the $4 \times 4$ matrix representation,

$$\gamma^\mu = \begin{pmatrix} 0 & -i\sigma^\mu \\ -i\bar\sigma^\mu & 0 \end{pmatrix}.$$


Comparing with equation 8.77, we see that the $\gamma$'s satisfy the so-called
Clifford algebra relation,
$$\gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 2\,g^{\mu\nu} I. \tag{8.82}$$
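The Clifford algebra relation is independent of the matrix representation, so it can be checked in any convenient one. The sketch below uses the common chiral (Weyl) representation built from the standard Pauli matrices; this is an assumption made for the illustration and is not necessarily the representation used in the text. The metric is taken as $\mathrm{diag}(+\,-\,-\,-)$.

```python
I2 = [[1, 0], [0, 1]]
ZERO2 = [[0, 0], [0, 0]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ETA = [1, -1, -1, -1]  # diagonal entries of the metric


def scale(s, m):
    return [[s * x for x in row] for row in m]


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Chiral representation: gamma^0 = [[0, I], [I, 0]], gamma^i = [[0, s_i], [-s_i, 0]]
GAMMA = [block(ZERO2, I2, I2, ZERO2)] + [
    block(ZERO2, s, scale(-1, s), ZERO2) for s in PAULI
]


def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(4)] for i in range(4)]


def clifford_holds(tol=1e-12):
    """Check gamma^mu gamma^nu + gamma^nu gamma^mu = 2 eta^{mu nu} I."""
    for mu in range(4):
        for nu in range(4):
            lhs = anticommutator(GAMMA[mu], GAMMA[nu])
            for i in range(4):
                for j in range(4):
                    rhs = 2 * ETA[mu] if (mu == nu and i == j) else 0
                    if abs(lhs[i][j] - rhs) > tol:
                        return False
    return True
```

Calling `clifford_holds()` returns `True`; replacing `GAMMA` by any other representation related to it by a similarity transformation leaves the result unchanged.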


We would like to establish some interesting connections between spinor
and tensorial quantities. Consider for instance a null vector $l_\mu \in M_{1,3}$.
Since the length of the vector is zero, the corresponding matrix $l_{A\dot B}$ has
determinant equal to zero; so the first row is a multiple of the second and hence
the matrix is of the form,

$$l_{A\dot B} = \phi_A\,\phi_{\dot B}.$$

Thus, if we choose a spinor $\phi_A$, then, up to a constant, the components of
the null vector are given by,
$$l_\mu = \sigma_\mu{}^{A\dot B}\,\phi_A\,\phi_{\dot B}.$$

Using the matrix components of the sigma matrices as in 8.10, a short
computation followed by normalization gives the vector,

$$l_\mu = \left(1,\; \frac{\zeta + \bar\zeta}{\zeta\bar\zeta + 1},\;
\frac{\zeta - \bar\zeta}{i(\zeta\bar\zeta + 1)},\;
\frac{\zeta\bar\zeta - 1}{\zeta\bar\zeta + 1}\right).$$
The spatial part is precisely the inverse image of complex numbers $\zeta$ under
the stereographic projection 5.61, viewed as inhomogeneous coordinates
$\zeta = \phi^1/\phi^2$ on the Riemann sphere $S^2 \cong \mathbb{CP}^1$. The norm of
the spatial part is one, so the norm in $M_{1,3}$ is zero, as it should be. One may
view the sphere as the intersection of the null cone with the hyperplane $t = 1$
(or $-1$), so this is essentially the celestial sphere. Spinors transform by
elements of $SL(2,\mathbb{C})$, which is the universal covering group of the group of
Möbius transformations. Möbius transformations are conformal maps, so this gives
a connection with minimal surfaces.

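Next, we note that self-dual tensors of rank 2 are associated with symmetric
spinors. The connection is made by first defining the spinor,

$$\sigma_{\mu\nu}{}^A{}_B = \sigma_{[\mu}{}^{A\dot C}\,\sigma_{\nu] B\dot C}, \tag{8.83}$$

$$\phantom{\sigma_{\mu\nu}{}^A{}_B} = \tfrac{1}{2}\left(\sigma_\mu{}^{A\dot C}\,\sigma_{\nu B\dot C} - \sigma_\nu{}^{A\dot C}\,\sigma_{\mu B\dot C}\right), \tag{8.84}$$

where the contraction $\sigma_{\mu\nu}{}^A{}_A$ is clearly 0. By direct computation
in the coordinate system we have chosen, the values of the traceless matrices are,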


0 —i -1 0 
oo1B a e, 0 ’ 02^ B T f i ’ 03 B a | 0 | , 


—i 0 0 —i 0 —1 
O12 = F | ’ 003 B = L ‘| $ 13 B = f 0 $ 


It follows that cavan is symmetric. The values are, 


-1 0 i 0 0 1 
O01AB = 0 1l? 002AB = 0 Gl? 003AB = 1 ol? 


0 2 —i 0 1 0 
012AB = z glè 023AB = A AE 013AB = 0 1° 




We find from the last set of equations that, 


12 


012 = 0 | = —1093, 
013 = o” = ioo, 
To = —0°! = inp, 
and hence 
guy = tery (8.85) 
2./det g 


is self-dual. From this, it follows that if F4? is a symmetric spinor, then 
Fu = Gee (8.86) 


is a self-dual tensor. It can also be verified by computation in the chosen 
coordinate system that. 


Owapor™*? = —26° 57, (8.87) 
From this, the equation above can be inverted, 


ae V (8.88) 


yielding a symmetric spinor from a self-dual tensor. This establishes a 1-1 
correspondence between self-dual tensors of rank two, and symmetric spinors 
of rank two. Tensors that transform like symmetric spinors of rank 2n are said 
to be irreducible under a spin n representation. This is particularly relevant for 
self-dual Maxwell tensors. More specifically, if, 


F = Fdz” ^ dx” 
is the Maxwell 2-form, the corresponding Maxwell spinor is, 
AÀ BÈ 
Fuape = on oy Fuv 


Here, (almost) following Penrose, we have used undotted and dotted letters 
with the same character, to avoid the proliferation of different letters. The 
Maxwell spinor can be written as, 


1 
Pape = 5(Paape — Fepaa): 


= Pape jp t+ OAREAB> 


where, dap and ¢ Ap are symmetric spinors given by, 


1 f 
PAB = Face 
bin = AB 


This gives a spinor decomposition of the tensor into self-dual and anti-self-dual 
parts. 




The matrices $\sigma_{\mu\nu}{}^A{}_B$ also serve to connect infinitesimal Lorentz
and spinor transformations. Given an infinitesimal Lorentz transformation as in
equation 8.66, the corresponding infinitesimal spinor transformation is,

$$\left(I + \tfrac{1}{2}\,\omega^{\mu\nu}M_{\mu\nu}\right)^A{}_B = \delta^A_B + \tfrac{1}{4}\,\omega^{\mu\nu}(\sigma_{\mu\nu})^A{}_B.$$

In other words, the spinor representation of the angular momentum tensor is,
$$(M_{\mu\nu})^A{}_B = \tfrac{1}{2}\,\sigma_{\mu\nu}{}^A{}_B. \tag{8.89}$$


Index manipulation of Dirac 4-spinors VW" is done by an extension €,, of the 
symplectic form to a 4 x 4 matrix which is also antisymmetric. In an accepted 
abuse of language, some authors refer to €,,, as the metric spinor. As with 
2-spinors, we lower on the left and raise on the right, 


UV, = ew", 
PE = Wen, 
In the coordinate basis we have chosen, €,,, has components 
m= ia Se (8.90) 
The spinor-index version of the Clifford algebra relation 8.82 is 
He gh +e gh, = 2g 8, (8.91) 
In a completely analogous manner equation 8.84, we construct self-dual 4-spinor 
Ha B which we write in matrix form as, 
Yuv = WwW = Wp = Yes Ww] (8.92) 


Then, the infinitesimal momentum tensor 8.89 in the 4-spinor representation 
becomes, 


(Myv)” g = iOm) g- (8.93) 


The matrices {7,7",7“”} and their duals, span the Clifford algebra. The dual 
of I is the celebrated matrix 


Pe 


E Emmar Y EYY (8.94) 


pi 
The commutation relations 8.67 insure that 
ye | Se Sn ys (8.95) 


The duals of the six $\gamma^{\mu\nu}$ matrices do not yield independent matrices,
and the dual of $\gamma^\mu$ is essentially $\gamma^\mu\gamma^5$, so the algebra is
spanned by $\{I, \gamma^\mu, \gamma^{\mu\nu}, \gamma^\mu\gamma^5, \gamma^5\}$,
and it has $1+4+6+4+1 = 16$ dimensions. We have defined the gamma
matrices in a particular matrix representation, but the general relations that
describe the algebra are basis-independent.




8.3 N-P Formalism 


Spinors provide an interesting formalism introduced in 1962 by Newman
and Penrose [24] for the study of general relativity. Let $\{M, g_{\mu\nu}\}$ be a
space-time with Lorentzian metric of signature $(+\,-\,-\,-)$. Introduce a null
tetrad $h^a{}_\mu = \{n_\mu, l_\mu, -\bar m_\mu, -m_\mu\}$, and dual frame forms,

$$\theta^a = h^a{}_\mu\, dx^\mu, \tag{8.96}$$
associated with the frame,
$$e_a = h^\mu{}_a\,\partial_\mu, \qquad h^\mu{}_a = \{l^\mu, n^\mu, m^\mu, \bar m^\mu\}. \tag{8.97}$$
In terms of the tetrad, the space-time metric is given by,
$$ds^2 = \eta_{ab}\,\theta^a\theta^b, \tag{8.98}$$

where $\eta$ is the quasi-orthonormal flat metric,

$$\eta_{ab} = \begin{pmatrix}
0 & 1 & 0 & 0\\
1 & 0 & 0 & 0\\
0 & 0 & 0 & -1\\
0 & 0 & -1 & 0
\end{pmatrix}, \tag{8.99}$$

used to manipulate tetrad indices. Here, the metric tensor has the form
$$g_{\mu\nu} = 2\,l_{(\mu}n_{\nu)} - 2\,m_{(\mu}\bar m_{\nu)}. \tag{8.100}$$

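The Cartan structure equations are,

$$\begin{aligned}
d\theta^a + \omega^a{}_b\wedge\theta^b &= 0, &\qquad \omega_{ab} &= -\omega_{ba},\\
d\omega^a{}_b + \omega^a{}_c\wedge\omega^c{}_b &= \Omega^a{}_b, &\qquad \Omega_{ab} &= -\Omega_{ba},\\
d\Omega^a{}_b - \Omega^a{}_c\wedge\omega^c{}_b + \omega^a{}_c\wedge\Omega^c{}_b &= 0.
\end{aligned} \tag{8.101}$$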


In this formalism the connection components are called the Ricci rotation co- 
efficients, which are defined by, 


VW be = Rohe ch®y. (8.102) 
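The Riemann tetrad components satisfy,

$$\begin{aligned}
\omega^a{}_b &= \gamma^a{}_{bc}\,\theta^c,\\
\Omega^a{}_b &= \tfrac{1}{2}\,R^a{}_{bcd}\;\theta^c\wedge\theta^d.
\end{aligned} \tag{8.103}$$
Einstein's equations in tetrad form are,

$$R_{ab} - \tfrac{1}{2}\,\eta_{ab}R = T_{ab}. \tag{8.104}$$

As is well known in the literature, the Riemann tensor admits the decomposition,
$$R_{abcd} = C_{abcd} + \frac{R}{12}\,g_{abcd} + E_{abcd}, \tag{8.105}$$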


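where $C_{abcd}$ is the conformal Weyl tensor, and,

$$\begin{aligned}
g_{abcd} &= \eta_{ac}\eta_{bd} - \eta_{ad}\eta_{bc},\\
E_{abcd} &= \tfrac{1}{2}\left(\eta_{ac}S_{bd} + \eta_{bd}S_{ac} - \eta_{ad}S_{bc} - \eta_{bc}S_{ad}\right),\\
S_{ab} &= R_{ab} - \tfrac{1}{4}\,\eta_{ab}R.
\end{aligned} \tag{8.106}$$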


The transition between tensor and spinor quantities is made by contraction
with the connecting spinors $\sigma^\mu{}_{A\dot B}$, which satisfy,

$$g_{\mu\nu}\,\sigma^\mu{}_{A\dot C}\,\sigma^\nu{}_{B\dot D} = \epsilon_{AB}\,\epsilon_{\dot C\dot D}.$$

The spinor dual one-forms 64? are defined by, 
AB AB a, AB 
06 SF dae = OTe 5 


where the tetrad connecting spinors are chosen such that, 
ds? = det 042, 
0 o? 


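We now introduce a spin dyad $\zeta_a{}^A = (\phi^A, \psi^A)$, with
$\phi_A\psi^A = 1$. The null tetrad can be written as,

$$\begin{aligned}
l_\mu &= \sigma_\mu{}^{A\dot B}\,\phi_A\phi_{\dot B}, &\qquad m_\mu &= \sigma_\mu{}^{A\dot B}\,\phi_A\psi_{\dot B},\\
n_\mu &= \sigma_\mu{}^{A\dot B}\,\psi_A\psi_{\dot B}, &\qquad \bar m_\mu &= \sigma_\mu{}^{A\dot B}\,\psi_A\phi_{\dot B}.
\end{aligned} \tag{8.107}$$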

The spin coefficients corresponding to the 24 Ricci rotation coefficients are
given by,
Taba m Chane i (8.108) 


The 12 complex spin coefficients may be arranged in three groups of four, ac- 
cording to the scheme, 


An ns Poon = {K, p, T, T}, 
By = Voip = {e, Q, B, y} = Vion, 
Cy =TViip = {7,, 1, v}, (8.109) 


The spin connection I Ag, which gives rise to the covariant derivative of spinors, 
is related to the spin coefficients by, 


To e Dipl Geese 


Following equation 8.84, we define, 


Cie E = Ta oyna (8.110) 




which we use to construct the connection and curvature spinors, 
Tg = woa“ B, 
043 = 0% o4,4B. (8.111) 


We can now write the spinor version of the equations of structure, which not 
surprisingly, have a very similar appearance, 


d0 È 4746 19°F +T a A 04° =0, 
dT4p4+TP4cAToR = Qap, 
dp — Níc AT? sg +146 A 0% 9 = 0. (8.112) 
_ As already discussed, if the coframes 0° undergo a Lorentz transformation 
(= 150", the spinor coframe undergoes a similarity transformation Q x Q € 
SL(2,C) x SL(2C), 
AAB A_pCDAB 
"= QoQ” 5 - 
In matrix notation, the connection and curvature spinors transform according 
to, 
P= QTQ + QdQ, 
Ô = QQE. (8.113) 
The spin connection [4g and the curvature spinor R^ geq satisfy equations 
analogous to 8.103 
iRee — T4 peb. = Pes 
car = TRA Bea Oc A 6t, 


BC? n gPP, (8.114) 


1pAA . |, 

= 3R" BCDD 

We also have a decomposition of curvature spinor into irreducible components 
Ra AapBocpd = VABcDe4BeeD + VApopeABecD 


1 = = zs = 
+ pRleaceBve joe pap + EABECDE AERC 


+ €4B@ op AB ean t+ €CD2aBcDeAB> (8.115) 
where, = 
PaBoDd = ®(4B)(CD) = Oypcp (8.116) 
is the traceless Ricci spinor, and, 
Vascp = V(aBcp) (8.117) 


is the completely symmetric Weyl conformal spinor. Finally, the spinorial ver- 
sion of Einstein’s equation takes the form, 


Papen = 4(LacsptTpeap)) T=T*a=R. (8.118) 




Equations 8.118 and 8.101 are called the Newman-Penrose (N-P) equations.
When written in detail in terms of the 12 spin coefficients, the N-P formalism
results in a formidable set of coupled first order differential equations
consisting of 4 metric equations, 18 spin coefficient equations, and 8 Bianchi
identities. Fortunately, the spin coefficients have geometric interpretations that
motivate imposing conditions on the N-P equations that make them tractable.
The Weyl spinor leads to an elegant classification of so-called algebraically special
space-times. The classification was originally done by Petrov, but it is now
commonly known as the Cartan-Petrov-Pirani-Penrose classification. One starts by
writing the completely symmetric conformal spinor as,


$\Psi_{ABCD} = \alpha_{(A}\beta_{B}\gamma_{C}\delta_{D)}.$   (8.119)


Each of the rank one spinors is associated with a null vector. The classification 
is as follows, 


1. Type I. Algebraically general - 4 distinct null directions.

2. Type II. Two null directions coincide.

3. Type D. There are two pairs of null directions that coincide.

4. Type III. Three principal directions coincide.

5. Type N. All principal directions coincide - also called Type Null.
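
The classification above depends only on how the four principal null directions group into coincident sets. As a minimal sketch (the function name `petrov_type` and the use of arbitrary labels for the directions are illustrative assumptions, not part of the N-P formalism), the bookkeeping can be written as:

```python
from collections import Counter

# Multiplicity pattern of the four principal null directions -> Petrov type.
PETROV_BY_PATTERN = {
    (1, 1, 1, 1): "I",
    (2, 1, 1): "II",
    (2, 2): "D",
    (3, 1): "III",
    (4,): "N",
}

def petrov_type(directions):
    """Classify from four hashable labels; equal labels mean coincident directions."""
    if len(directions) != 4:
        raise ValueError("exactly four principal null directions are required")
    pattern = tuple(sorted(Counter(directions).values(), reverse=True))
    return PETROV_BY_PATTERN[pattern]
```

For instance, `petrov_type("aabb")` corresponds to two pairs of coincident directions, the case that arises for the Schwarzschild and Kerr space-times.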


The 1962 seminal paper by Newman and Penrose [24] is noted for its elegant
proof, in terms of the spin coefficients, of the Goldberg-Sachs theorem. The
theorem states that a non-flat, vacuum space-time is algebraically special if,
and only if, it contains a null geodesic congruence that is shear-free; that is,
there is a null vector with $\kappa = 0$ and $\sigma = 0$.

The literature on applications of the N-P formalism is huge. A Google search 
on “Newman-Penrose Spin Coefficients” yields over 54,000 results. We provide 
here a very simple example. 


8.3.1 Example Consider the Vaidya metric in Eddington-Finkelstein coor- 
dinates 6.75. A null tetrad can be adapted to this metric by choosing 


a Feos 1 Ex i a 
w= gr ad E p/2 |30 ` sind ð)’ 

ee: oti [2 í J 
E ðu 2 r ðr? Mp2 |30 sinĝðg|” 


Thus, we have an associated spin dyad as in equation 8.107 
Ly = baop: Nu wavy, My —> abp. 
The only non-zero component of the curvature spinor is 


Pa = Wancp pP UC? = P (8.120) 


r3 




which is consistent with the space being of Petrov type D. By the clever idea of
allowing r to assume complex values, and then performing a complex rotation,
Newman and Janis were able to obtain the Kerr metric [25].


8.3.1 The Kerr Metric 


When Einstein first introduced the theory of general relativity in 1915, it 
took but two months for Schwarzschild to develop a solution to the vacuum 
field equations. It took another 45 years to find a Ricci-flat solution describing 
an axially symmetric, rotating, space-time. The solution was found by R. Kerr 
in 1963. A space-time is said to be in Kerr-Schild form, if the line element can 
be written as 

$g_{\mu\nu} = \eta_{\mu\nu} + H\,l_\mu l_\nu,$   (8.121)


where $\eta$ is the flat metric, H is a scalar function, and $l_\mu$ is a null vector with
respect to g and $\eta$. It is easy to show that the Schwarzschild metric can be
written in Kerr-Schild form. Start with the Eddington-Finkelstein coordinates,


$ds^2 = 2\,dr\,du + \left[1 - \frac{2m}{r}\right] du^2 - r^2\,d\theta^2 - r^2\sin^2\theta\,d\phi^2.$


Since
$(du + dr)^2 = du^2 + 2\,du\,dr + dr^2,$


we can solve for $2\,du\,dr$ and substitute into the metric. We get


$ds^2 = -\frac{2m}{r}\,du^2 - dr^2 - r^2\,d\theta^2 - r^2\sin^2\theta\,d\phi^2 + (du + dr)^2.$


Now, we let $x^0 = u + r$. The transformation yields


$ds^2 = (dx^0)^2 - dr^2 - r^2\,d\theta^2 - r^2\sin^2\theta\,d\phi^2 - \frac{2m}{r}\,(dx^0 - dr)^2,$


which is the desired Kerr-Schild form with the Minkowski metric written in 
spherical coordinates. In Boyer-Lindquist coordinates the Kerr metric is given 
by (See [21]) 


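$ds^2 = \frac{\Delta}{\rho^2}\,(dt - a\sin^2\theta\,d\phi)^2 - \frac{\rho^2}{\Delta}\,dr^2 - \rho^2\,d\theta^2 - \frac{\sin^2\theta}{\rho^2}\,[(r^2 + a^2)\,d\phi - a\,dt]^2,$   (8.122)
where,
$\Delta = r^2 - 2mr + a^2; \quad a^2 < m^2,$
$\rho^2 = r^2 + a^2\cos^2\theta.$   (8.123)


Since the metric coefficients do not depend on t and $\phi$, the quantities $\partial_t = \frac{\partial}{\partial t}$
and $\partial_\phi = \frac{\partial}{\partial \phi}$ are Killing vector fields, and we get associated conserved quantities
E and L as in the case of the Schwarzschild space-time. When a = 0 the line
element immediately reduces to the Schwarzschild metric. When m = 0, the
cross terms with $(dt\,d\phi)$ cancel out and one gets


$ds^2 = dt^2 - \frac{r^2 + a^2\cos^2\theta}{r^2 + a^2}\,dr^2 - (r^2 + a^2\cos^2\theta)\,d\theta^2 - (r^2 + a^2)\sin^2\theta\,d\phi^2.$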




This one is not obvious, but it is the flat Minkowski metric in oblate spheroidal
coordinates, that is, one for which the spatial part is the $R^3$ metric based on
ellipsoids

$\frac{x^2}{r^2 + a^2} + \frac{y^2}{r^2 + a^2} + \frac{z^2}{r^2} = 1,$   (8.124)

parametrized by the transformation

$x = \sqrt{r^2 + a^2}\,\cos\phi\,\sin\theta,$
$y = \sqrt{r^2 + a^2}\,\sin\phi\,\sin\theta,$
$z = r\cos\theta.$
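
The claim that the m = 0 line element is flat can be spot-checked numerically. The sketch below (function names and the finite-difference step `h` are illustrative choices) pulls back the Euclidean metric of $R^3$ through the oblate spheroidal parametrization by central differences and compares it with the spatial coefficients $\frac{r^2 + a^2\cos^2\theta}{r^2 + a^2}$, $r^2 + a^2\cos^2\theta$ and $(r^2 + a^2)\sin^2\theta$.

```python
import math

def embed(r, theta, phi, a):
    """Oblate spheroidal coordinates (r, theta, phi) -> Cartesian (x, y, z)."""
    s = math.sqrt(r * r + a * a)
    return (s * math.cos(phi) * math.sin(theta),
            s * math.sin(phi) * math.sin(theta),
            r * math.cos(theta))

def induced_metric(r, theta, phi, a, h=1e-5):
    """Euclidean metric pulled back to (r, theta, phi), via central differences."""
    q = [r, theta, phi]
    tangent = []
    for k in range(3):
        up, dn = q[:], q[:]
        up[k] += h
        dn[k] -= h
        pu, pd = embed(up[0], up[1], up[2], a), embed(dn[0], dn[1], dn[2], a)
        tangent.append([(u - d) / (2 * h) for u, d in zip(pu, pd)])
    return [[sum(x * y for x, y in zip(tangent[i], tangent[j]))
             for j in range(3)] for i in range(3)]

def expected_diagonal(r, theta, a):
    """Spatial coefficients of the m = 0 Kerr line element (up to overall sign)."""
    rho2 = r * r + (a * math.cos(theta)) ** 2
    return (rho2 / (r * r + a * a), rho2, (r * r + a * a) * math.sin(theta) ** 2)
```

Agreement at a handful of sample points is only a sanity check, not a proof; the analytic verification follows by differentiating the three coordinate functions directly.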


The metric blows up at $\rho^2 = 0$ and $\Delta = 0$. When $\rho^2 = 0$ we have

$r = 0, \quad \theta = \frac{\pi}{2}.$

This constitutes a real singularity of the curvature. On the other hand, it
can be shown that the singularity $\Delta = 0$ can be removed by a change
of coordinates, so this singularity is more like the apparent singularity of the
Schwarzschild metric at r = 2m. The quadratic equation $\Delta = 0$ has solutions

$r_\pm = m \pm \sqrt{m^2 - a^2}.$


Thus the space-time is divided into three regions, 


$R_1$: where $r_+ < r$,
$R_2$: where $r_- < r < r_+$,
$R_3$: where $r < r_-$.

We live in region $R_1$. The boundaries at $r_\pm$ represent an outer and an inner
event horizon respectively. A time-like particle can cross from region $R_1$ to $R_2$
and from $R_2$ to $R_3$, but not the other way around. As such, $r_+$ is the real
event horizon. A new feature that is not present in the Schwarzschild metric
is a region called the ergosphere, which lies outside the event horizon and inside
the oblate region defined by $g_{00} = 0$. Particles entering the ergosphere from
region $R_1$ are subjected to frame-dragging by the rotation of the black hole. By
inspection of the metric 8.122, we see that,


$g_{00} = \frac{\Delta}{\rho^2} - \frac{a^2\sin^2\theta}{\rho^2} = \frac{r^2 - 2mr + a^2\cos^2\theta}{\rho^2}.$

Thus, the outer boundary of the ergosphere is given by the root

$r = m + \sqrt{m^2 - a^2\cos^2\theta}.$   (8.125)
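
To get a feel for these radii, here is a small numerical sketch (function names are illustrative; geometric units and $a^2 < m^2$ are assumed). It evaluates the horizons $r_\pm$, the ergosphere boundary 8.125, and $g_{00}$ as read from the metric 8.122.

```python
import math

def delta(r, m, a):
    """Delta = r^2 - 2 m r + a^2."""
    return r * r - 2 * m * r + a * a

def rho2(r, theta, a):
    """rho^2 = r^2 + a^2 cos^2(theta)."""
    return r * r + (a * math.cos(theta)) ** 2

def g00(r, theta, m, a):
    """g_00 = (Delta - a^2 sin^2(theta)) / rho^2."""
    return (delta(r, m, a) - (a * math.sin(theta)) ** 2) / rho2(r, theta, a)

def horizons(m, a):
    """Roots (r_minus, r_plus) of Delta = 0; requires a^2 <= m^2."""
    root = math.sqrt(m * m - a * a)
    return m - root, m + root

def ergosphere_radius(theta, m, a):
    """Outer root of g_00 = 0 at polar angle theta."""
    return m + math.sqrt(m * m - (a * math.cos(theta)) ** 2)
```

With these definitions the ergosphere boundary touches the outer horizon at the poles ($\theta = 0$) and reaches r = 2m on the equator, which is the oblate shape sketched in figure 8.4.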




Event horizon Ergosphere 


Fig. 8.4: Ergosphere 


The rotational frame-dragging is caused by the change at this boundary of the
Killing vector field $\partial_t$ from being time-like to space-like. In Boyer-Lindquist
coordinates it is relatively easy to compute the connection forms. We choose
an orthonormal coframe

$\theta^0 = \frac{\sqrt{\Delta}}{\rho}\,(dt - a\sin^2\theta\,d\phi),$
$\theta^1 = \frac{\rho}{\sqrt{\Delta}}\,dr,$
$\theta^2 = \rho\,d\theta,$
$\theta^3 = \frac{\sin\theta}{\rho}\,[(r^2 + a^2)\,d\phi - a\,dt],$   (8.126)

so that

$ds^2 = (\theta^0)^2 - (\theta^1)^2 - (\theta^2)^2 - (\theta^3)^2.$


The idea is the same as in previous connection computations, noting that the 
connection forms are antisymmetric. We take the exterior derivatives of the 
coframe forms and express the results in terms of the basis. We read the 
connection coefficients from the first Cartan structure equation, and check for 
consistency for possible missing terms. The presence of cross terms makes the 
calculation a bit more challenging and requires some finesse. We compute the 
quantities we need as we go along, starting with, 


$\rho\,d\rho = r\,dr - a^2\cos\theta\sin\theta\,d\theta,$
$d\Delta = 2(r - m)\,dr.$




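The order of the computation is not really important, so we might as well begin
with $\theta^1$ and $\theta^2$, which are the easiest. We have

$d\theta^1 = d\left(\frac{\rho}{\sqrt{\Delta}}\right) \wedge dr,$
$= \frac{\sqrt{\Delta}\,d\rho - \rho\,d\sqrt{\Delta}}{\Delta} \wedge dr,$
$= \frac{1}{\sqrt{\Delta}}\left[\frac{1}{\rho}\,(r\,dr - a^2\cos\theta\sin\theta\,d\theta)\right] \wedge dr,$
$= -\frac{1}{\rho\sqrt{\Delta}}\,a^2\cos\theta\sin\theta\;\frac{\theta^2}{\rho} \wedge \frac{\sqrt{\Delta}}{\rho}\,\theta^1,$
$= -\frac{a^2\cos\theta\sin\theta}{\rho^3}\;\theta^2 \wedge \theta^1.$

Continuing,

$d\theta^2 = d\rho \wedge d\theta,$
$= \frac{1}{\rho}\,r\,dr \wedge d\theta,$
$= \frac{r}{\rho}\,\frac{\sqrt{\Delta}}{\rho}\,\theta^1 \wedge \frac{1}{\rho}\,\theta^2,$
$= \frac{r\sqrt{\Delta}}{\rho^3}\;\theta^1 \wedge \theta^2.$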

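Comparing these differentials with the structure equations

$d\theta^1 = -\omega^1{}_j \wedge \theta^j, \quad d\theta^2 = -\omega^2{}_j \wedge \theta^j,$

we infer that

$\omega^1{}_2 = -\frac{a^2\cos\theta\sin\theta}{\rho^3}\,\theta^1 - \frac{r\sqrt{\Delta}}{\rho^3}\,\theta^2.$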


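However, we should not be surprised if the expressions above for $d\theta^1$ and $d\theta^2$
have other terms that either add to zero or wedge to zero. The other two
structure equations are more elaborate. We have

$d\theta^0 = d\left(\frac{\sqrt{\Delta}}{\rho}\right) \wedge (dt - a\sin^2\theta\,d\phi) + \frac{\sqrt{\Delta}}{\rho}\,(-2a\sin\theta\cos\theta\,d\theta \wedge d\phi),$
$= \frac{\rho\,d\sqrt{\Delta} - \sqrt{\Delta}\,d\rho}{\rho^2} \wedge \frac{\rho}{\sqrt{\Delta}}\,\theta^0 - \frac{2a\sqrt{\Delta}}{\rho}\,\sin\theta\cos\theta\,d\theta \wedge d\phi,$
$= \frac{1}{\rho^2}\left[\rho\,\frac{(r - m)}{\sqrt{\Delta}}\,dr - \frac{\sqrt{\Delta}}{\rho}\,(r\,dr - a^2\cos\theta\sin\theta\,d\theta)\right] \wedge \frac{\rho}{\sqrt{\Delta}}\,\theta^0 - \frac{2a\sqrt{\Delta}}{\rho}\,\sin\theta\cos\theta\,d\theta \wedge d\phi,$
$= \left[\frac{r - m}{\Delta} - \frac{r}{\rho^2}\right] dr \wedge \theta^0 + \frac{1}{\rho^3}\,(a^2\cos\theta\sin\theta)\,\theta^2 \wedge \theta^0 - \frac{2a\sqrt{\Delta}}{\rho}\,\sin\theta\cos\theta\,d\theta \wedge d\phi,$
$= \left[\frac{\rho^2(r - m) - r\Delta}{\rho^3\sqrt{\Delta}}\right] \theta^1 \wedge \theta^0 + \frac{1}{\rho^3}\,(a^2\cos\theta\sin\theta)\,\theta^2 \wedge \theta^0 - \frac{2a\sqrt{\Delta}}{\rho}\,\sin\theta\cos\theta\,d\theta \wedge d\phi.$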


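For the last term on the right, we will need to express $d\phi$ in terms of the coframe.
This is easily done by eliminating dt and solving for $d\phi$ from the equations for
$\theta^0$ and $\theta^3$,

$dt = \frac{\rho}{\sqrt{\Delta}}\,\theta^0 + a\sin^2\theta\,d\phi,$
$\theta^3 = \frac{\sin\theta}{\rho}\left[(r^2 + a^2)\,d\phi - \frac{a\rho}{\sqrt{\Delta}}\,\theta^0 - a^2\sin^2\theta\,d\phi\right],$
$= \frac{\sin\theta}{\rho}\left[\rho^2\,d\phi - \frac{a\rho}{\sqrt{\Delta}}\,\theta^0\right],$
$d\phi = \frac{a}{\rho\sqrt{\Delta}}\,\theta^0 + \frac{1}{\rho\sin\theta}\,\theta^3.$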


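Substituting for $d\phi$ into the last equation for $d\theta^0$, we get after some simplification

$d\theta^0 = \frac{\rho^2(r - m) - r\Delta}{\rho^3\sqrt{\Delta}}\;\theta^1 \wedge \theta^0 - \frac{a^2\cos\theta\sin\theta}{\rho^3}\;\theta^2 \wedge \theta^0 - \frac{2a\sqrt{\Delta}\cos\theta}{\rho^3}\;(\theta^2 \wedge \theta^3).$   (8.127)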


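We proceed to compute $d\theta^3$ in a similar manner.

$d\theta^3 = d\left(\frac{\sin\theta}{\rho}\right) \wedge [(r^2 + a^2)\,d\phi - a\,dt] + \frac{\sin\theta}{\rho}\,2r\,dr \wedge d\phi,$
$= \frac{\rho\cos\theta\,d\theta - \sin\theta\,d\rho}{\rho^2} \wedge \frac{\rho}{\sin\theta}\,\theta^3 + \frac{2r\sin\theta}{\rho}\,dr \wedge d\phi,$
$= \frac{1}{\rho\sin\theta}\,[\rho\cos\theta\,d\theta - \sin\theta\,d\rho] \wedge \theta^3 + \frac{2r\sin\theta}{\rho}\,dr \wedge d\phi.$

The rest of the computation is completely straight-forward. We substitute the
differentials $d\theta$, $d\rho$, dr and $d\phi$ in terms of the coframe, and simplify. Notice that
we do not need to solve for dt. We leave it to the reader to verify the result,

$d\theta^3 = \frac{2ar\sin\theta}{\rho^3}\;\theta^1 \wedge \theta^0 + \frac{r\sqrt{\Delta}}{\rho^3}\;\theta^1 \wedge \theta^3 + \frac{\cos\theta}{\rho^3\sin\theta}\,(r^2 + a^2)\;\theta^2 \wedge \theta^3.$   (8.128)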


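Anticipating possible missing terms required for consistency, we split the $\theta^2 \wedge \theta^3$ term
in the equation 8.127 for $d\theta^0$ as

$-\frac{2a\cos\theta\sqrt{\Delta}}{\rho^3}\;\theta^2 \wedge \theta^3 = -\frac{a\cos\theta\sqrt{\Delta}}{\rho^3}\;\theta^2 \wedge \theta^3 + \frac{a\cos\theta\sqrt{\Delta}}{\rho^3}\;\theta^3 \wedge \theta^2,$

and do the same for the $\theta^1 \wedge \theta^0$ term in the expression 8.128. Together with 8.3.1,
we can now read all the independent connection forms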


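$\omega^0{}_1 = \left[\frac{\rho^2(r - m) - r\Delta}{\rho^3\sqrt{\Delta}}\right]\theta^0 - \frac{ar\sin\theta}{\rho^3}\,\theta^3,$
$\omega^0{}_2 = -\frac{a^2\cos\theta\sin\theta}{\rho^3}\,\theta^0 - \frac{a\cos\theta\sqrt{\Delta}}{\rho^3}\,\theta^3,$
$\omega^0{}_3 = \frac{a\cos\theta\sqrt{\Delta}}{\rho^3}\,\theta^2 - \frac{ar\sin\theta}{\rho^3}\,\theta^1,$
$\omega^1{}_2 = -\frac{a^2\cos\theta\sin\theta}{\rho^3}\,\theta^1 - \frac{r\sqrt{\Delta}}{\rho^3}\,\theta^2,$
$\omega^1{}_3 = -\frac{ar\sin\theta}{\rho^3}\,\theta^0 - \frac{r\sqrt{\Delta}}{\rho^3}\,\theta^3,$
$\omega^2{}_3 = -\frac{\cos\theta}{\rho^3\sin\theta}\,(r^2 + a^2)\,\theta^3 - \frac{a\cos\theta\sqrt{\Delta}}{\rho^3}\,\theta^0.$   (8.129)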


The only term above that could not be read immediately from the computed
differentials is the second term in $\omega^2{}_3$. Here, the $\theta^0$ term comes from a modification
of the formula for $d\theta^2$ that is required for consistency with $\omega_{02} = -\omega_{20}$.
The computation of the curvature form requires no finesse; it is just a lengthy 
“plug and chug” calculation that presently is not a task for human beings. 
With the use of a computer algebra system such as Maple or Mathematica, 
one can verify that the curvature is Ricci-flat. For a full discussion of physical 


implications of the Kerr geometry, see Misner, Thorne and Wheeler [21]. 


8.3.2 Eth Operator 


One of the most pulchritudinous results arising from the N-P formalism, 
was the serendipitous discovery by Newman and Penrose of a characterization 
of asymptotically half-flat space-times in terms of an operator called eth. The 
operator acts on a space consisting of spin-weighted functions on a sphere. A 
function $\eta$ on a sphere has spin weight s if it transforms as,

$\eta \to e^{is\psi}\,\eta,$   (8.130)

under a rotation by an angle $\psi$ about the north pole. The spin-weight is constrained to be
a $\frac{1}{2}$-integer. Here we are only interested in the nature of the operator in its
relation to representations, but to provide some historical context, we say a few
words about half-flat space times. The simplest way to introduce the notion of
half-flat is to consider the good cut differential equation,

$\eth^2 Z = \sigma^0(Z, \zeta, \bar{\zeta}),$   (8.131)
where the eth operator is defined by, 


ön = OP? (Pon). (8.132) 




Here, P is the conformal factor in the Fubini-Study metric 5.63,

$P = \frac{1}{2}(1 + \zeta\bar{\zeta}),$

and $\sigma^0(Z, \zeta, \bar{\zeta})$ is some complex valued function (which in context represents
the asymptotic value of the shear spin coefficient $\sigma$). The idea in the N-P formalism
is that if one could solve this equation, then one could construct an (asymptotically)
half-flat space-time. The definition of a half-flat space-time starts with
space-time analytically continued into the complex. Then the components of
the spinor image of the Weyl tensor,

$C_{abcd} \to \Psi_{ABCD}\,\epsilon_{A'B'}\,\epsilon_{C'D'} + \tilde{\Psi}_{A'B'C'D'}\,\epsilon_{AB}\,\epsilon_{CD}$   (8.133)

are now independent; here indicated by replacing $\bar{\Psi}$ by $\tilde{\Psi}$. The two components
are the self-dual, and the anti-self-dual parts of the spinor. The space is half-flat
if it is Ricci flat, and,

$\tilde{\Psi}_{A'B'C'D'} = 0$   (8.134)


The reason a sphere $S^2$ enters into the picture can be motivated by a simple
geometrical argument. If space time were spherical, then null rays would converge
to a single point at infinity. Conformal null infinity in this case can be viewed as
the intersection of a hyperplane with the 4-sphere at the north pole. However,
if the space is Lorentzian and asymptotically flat, then conformal infinity looks
like a hyperplane intersecting with a hyperbolic surface, which is a cone with
topology $S^2 \times R$.

In spherical coordinates, the eth operator acting on a function $\eta$ of spin weight s
takes the tantalizing form

$\eth\eta = -(\sin\theta)^{s}\left[\frac{\partial}{\partial\theta} + \frac{i}{\sin\theta}\frac{\partial}{\partial\phi}\right](\sin\theta)^{-s}\,\eta,$
$\bar{\eth}\eta = -(\sin\theta)^{-s}\left[\frac{\partial}{\partial\theta} - \frac{i}{\sin\theta}\frac{\partial}{\partial\phi}\right](\sin\theta)^{s}\,\eta.$   (8.135)


We have, 


$(\eth\eta)' = e^{i(s+1)\psi}\,(\eth\eta),$   (8.136)

so these act as raising and lowering operators of spin weight. One can also
verify that,

$(\bar{\eth}\eth - \eth\bar{\eth})\,\eta = 2s\,\eta.$   (8.137)

The eigenfunctions of the operator $\bar{\eth}\eth$ are called spin-weighted spherical harmonics
and are denoted by ${}_sY_{lm}(\theta, \phi)$, where $|s| \le l$. Some authors have pointed
out that these entities were previously known to Gelfand. In the case s = 0, the
operator $\bar{\eth}\eth$ is just the Laplacian, so the eigenfunctions are spherical harmonics.
Since ð and 6 raise and lower the spin weight respectively, the spin-weighted 
spherical harmonics can be obtained by successive applications of the operators 




to spherical harmonics. The elegant formulas were derived by Goldberg, et al.
[10].

$\bar{\eth}\eth\;{}_sY_{lm} = -(l - s)(l + s + 1)\;{}_sY_{lm},$
$\eth\;{}_sY_{lm} = \sqrt{(l - s)(l + s + 1)}\;{}_{s+1}Y_{lm},$
$\bar{\eth}\;{}_sY_{lm} = -\sqrt{(l + s)(l - s + 1)}\;{}_{s-1}Y_{lm}.$
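
The raising and lowering coefficients can be checked against the commutator 8.137 with a few lines of arithmetic. The sketch below (function names are illustrative) assumes the standard coefficients $\sqrt{(l-s)(l+s+1)}$ for raising and $-\sqrt{(l+s)(l-s+1)}$ for lowering, composes them in both orders, and returns the resulting eigenvalues on ${}_sY_{lm}$.

```python
import math

def raise_sq(l, s):
    """Square of the coefficient in eth sY_lm = sqrt(.) (s+1)Y_lm."""
    return (l - s) * (l + s + 1)

def lower_sq(l, s):
    """Square of the coefficient in ethbar sY_lm = -sqrt(.) (s-1)Y_lm."""
    return (l + s) * (l - s + 1)

def ethbar_eth(l, s):
    """Eigenvalue of ethbar.eth: raise s -> s+1, then lower s+1 -> s."""
    return -math.sqrt(raise_sq(l, s)) * math.sqrt(lower_sq(l, s + 1))

def eth_ethbar(l, s):
    """Eigenvalue of eth.ethbar: lower s -> s-1, then raise s-1 -> s."""
    return -math.sqrt(lower_sq(l, s)) * math.sqrt(raise_sq(l, s - 1))

def commutator_eigenvalue(l, s):
    """Eigenvalue of (ethbar.eth - eth.ethbar) on sY_lm; should equal 2s."""
    return ethbar_eth(l, s) - eth_ethbar(l, s)
```

For every $|s| \le l$ the first composition reproduces $-(l - s)(l + s + 1)$ and the difference of the two reproduces 2s.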


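Applying these equations iteratively, one can show that

${}_sY_{lm} = \sqrt{\frac{(l - s)!}{(l + s)!}}\;\eth^{s}\,Y_{lm}, \quad$ if $0 \le s \le l$,

${}_sY_{lm} = \sqrt{\frac{(l + s)!}{(l - s)!}}\;(-1)^{s}\,\bar{\eth}^{-s}\,Y_{lm}, \quad$ if $-l \le s \le 0$.   (8.138)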


In the context of representation theory of the rotation group, the main result
is that the Wigner $D^l{}_{mm'}$ matrix elements can be expressed, after a messy
computation, by the neat formula [10],

$D^l{}_{-m\,s}(\phi, \theta, -\psi) = (-1)^m \sqrt{\frac{4\pi}{2l + 1}}\;{}_sY_{lm}(\theta, \phi)\,e^{is\psi}.$   (8.139)


In 1985, T. Dray [7] proved that with an appropriate choice of spin gauge, spin
weighted spherical harmonics were the same as the monopole harmonics
introduced by Wu and Yang as solutions of a semiclassical electron in the field of a
Dirac monopole. This is not surprising since, as we will see in the next chapter,
the Dirac monopole is associated with a connection on a U(1) Hopf bundle over
$S^2$, whereas the transformation law for spin-weighted functions is basically a
gauge transformation in such a bundle.


8.4 SU(3) 


The SU(3) group was introduced by Gell-Mann in 1961, as a candidate
for a symmetry gauge group to accommodate quark “flavors”. In the language of
particle physics in this theory, hadrons are made up of three quarks with flavors
called: up, down and strange (u,d,s), at a time when particles with the “flavor”
attributes of charm, top, or bottom (c,t,b) were unknown. If one denotes a
flavor state by,

$|\psi_f\rangle = \begin{pmatrix} u \\ d \\ s \end{pmatrix},$

the isospin action by $g \in SU(3)$ is simply given by $|\psi_f\rangle \mapsto g\,|\psi_f\rangle$. The Lie
algebra su(3) consists of 3 x 3 traceless, Hermitian matrices. The dimension 
of the special unitary group SU(n) is n? — 1, so su(3) has 8 generators. Gell- 
Mann chose for these generators, the closest extension of Pauli matrices. The 




8 Gell-Mann matrices are, (see Georgi [9]) 


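$\lambda_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad \lambda_2 = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad \lambda_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix},$

$\lambda_4 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \quad \lambda_5 = \begin{pmatrix} 0 & 0 & -i \\ 0 & 0 & 0 \\ i & 0 & 0 \end{pmatrix},$

$\lambda_6 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \quad \lambda_7 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}, \quad \lambda_8 = \frac{1}{\sqrt{3}}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{pmatrix}.$   (8.140)

Anticipating a factor of 2i as in the case of Pauli matrices, we use the standard
convention of denoting the structure constants by,

$[\lambda_j, \lambda_k] = 2i\,f^l{}_{jk}\,\lambda_l.$   (8.141)

Effectively, the matrices are normalized so that,

$\mathrm{Tr}(\lambda_j\lambda_k) = 2\,\delta_{jk}.$

It is customary to set

$T_j = \frac{1}{2}\,\lambda_j.$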
The structure constants turn out to be completely antisymmetric, so, modulo
permutations, there are only 8 independent ones. The upper 2 x 2 blocks of
$\{\lambda_1, \lambda_2, \lambda_3\}$ are just the Pauli matrices, so this set constitutes an su(2)
subalgebra, and we have

$f^i{}_{jk} = \epsilon^i{}_{jk}$

whenever all indices are less than or equal to 3. The isotopic spin SU(2) subgroup
generated by the exponential map of these generators results in rotations of the
flavor state $|\psi_f\rangle$ that leave s invariant. Defining,

$\tau_+ = \sqrt{3}\,\lambda_8 + \lambda_3,$
$\tau_- = \sqrt{3}\,\lambda_8 - \lambda_3,$

one can also identify two more su(2) subalgebras generated by $\{\lambda_4, \lambda_5, \tau_+\}$ and
$\{\lambda_6, \lambda_7, \tau_-\}$ respectively. The rest of the non-zero structure constants are easily
computed using symbolic manipulation software. The results are,

$f_{147} = f_{246} = f_{257} = f_{345} = \frac{1}{2},$
$f_{156} = f_{367} = -\frac{1}{2},$
$f_{458} = f_{678} = \frac{\sqrt{3}}{2}.$


A concise formula for the structure constants is given by, 


$f_{ijk} = -\frac{i}{4}\,\mathrm{Tr}(\lambda_i\,[\lambda_j, \lambda_k]).$   (8.142)
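
The trace formula lends itself to a direct numerical check. The following sketch uses plain nested lists rather than a computer algebra system (the helper names are illustrative, and the convention $[\lambda_j, \lambda_k] = 2i f_{jkl}\lambda_l$ is assumed); it evaluates $f_{ijk} = -\frac{i}{4}\mathrm{Tr}(\lambda_i[\lambda_j, \lambda_k])$ for the Gell-Mann matrices.

```python
import math

def commutator(A, B):
    """Matrix commutator [A, B] for square matrices given as nested lists."""
    n = len(A)
    AB = [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    BA = [[sum(B[i][k] * A[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def trace_of_product(A, B):
    """Tr(A B) without forming the full product."""
    n = len(A)
    return sum(A[i][k] * B[k][i] for i in range(n) for k in range(n))

S = 1 / math.sqrt(3)
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[S, 0, 0], [0, S, 0], [0, 0, -2 * S]],
]

def structure_constant(i, j, k):
    """f_ijk = -(i/4) Tr(lambda_i [lambda_j, lambda_k]); indices run from 1 to 8."""
    li, lj, lk = (GELL_MANN[n - 1] for n in (i, j, k))
    return (-0.25j * trace_of_product(li, commutator(lj, lk))).real
```

Looping `structure_constant` over all index triples reproduces the table of non-zero values quoted above, together with $f_{123} = 1$ from the su(2) subalgebra.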




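Another useful fact is the formula for the anti-commutators,

$\{\lambda_j, \lambda_k\} = \frac{4}{3}\,\delta_{jk}\,I + 2\,d^l{}_{jk}\,\lambda_l,$ where,
$d_{ijk} = \frac{1}{4}\,\mathrm{Tr}(\lambda_i\,\{\lambda_j, \lambda_k\}).$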


The Killing form, 

Ijk = FT gent ri = 3ðjk, 
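modulo the usual problematic factor of 2i, is non-degenerate and positive definite,
as expected from a semi-simple, compact group. The Cartan subalgebra is
spanned by $h_3 = T_3$ and $h_8 = T_8$, which commute with each other.
Thus, the subalgebra is of rank 2 and there are 2 Casimir operators. Since the
generators of the Cartan subalgebra are already in diagonal form, it is easy to
find the eigenvectors and corresponding weights [9],

$\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \to \left(\tfrac{1}{2}, \tfrac{\sqrt{3}}{6}\right), \quad \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \to \left(-\tfrac{1}{2}, \tfrac{\sqrt{3}}{6}\right), \quad \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \to \left(0, -\tfrac{\sqrt{3}}{3}\right).$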


To find the roots we need generators that take one weight to another. From
lessons learned from su(2) we take as raising and lowering operators,

$\frac{1}{\sqrt{2}}\,(T_1 \pm iT_2), \quad \frac{1}{\sqrt{2}}\,(T_4 \pm iT_5), \quad \frac{1}{\sqrt{2}}\,(T_6 \pm iT_7).$

Fig. 8.5: Root Diagram $A_2 = su(3)$

A straight-forward computation of the commutators with the generators of the 
Cartan subalgebra gives, 


$[T_3, (T_1 \pm iT_2)] = \pm(T_1 \pm iT_2),$
$[T_8, (T_1 \pm iT_2)] = 0.$

So, we have found two roots, $(\pm 1, 0)$. Continuing the computation for the next
set of ladder operators, we get,

$[T_3, (T_4 \pm iT_5)] = \pm\tfrac{1}{2}\,(T_4 \pm iT_5),$
$[T_8, (T_4 \pm iT_5)] = \pm\tfrac{\sqrt{3}}{2}\,(T_4 \pm iT_5),$

so the next pair of roots are $(\pm\tfrac{1}{2}, \pm\tfrac{\sqrt{3}}{2})$. Finally, the commutators for the third
set of ladder operators lead to,

$[T_3, (T_6 \pm iT_7)] = \mp\tfrac{1}{2}\,(T_6 \pm iT_7),$
$[T_8, (T_6 \pm iT_7)] = \pm\tfrac{\sqrt{3}}{2}\,(T_6 \pm iT_7),$

giving the last pair of roots $(\mp\tfrac{1}{2}, \pm\tfrac{\sqrt{3}}{2})$. The complexification of su(3) is sl(3, C).
In Cartan’s classification of semisimple Lie algebras, the root system for sl(n + 1, C)
is called $A_n$.

In the root diagram $A_2 = su(3)$, the roots form a regular hexagon with two
roots at the center, as shown in figure 8.5. We refer the reader to the famous
book by Georgi [9] for a full discussion of representations of groups in Physics.
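
Since the ladder operators take one weight of the fundamental representation to another, the six non-zero roots can also be read off as differences of the three weights $(\tfrac{1}{2}, \tfrac{\sqrt{3}}{6})$, $(-\tfrac{1}{2}, \tfrac{\sqrt{3}}{6})$, $(0, -\tfrac{\sqrt{3}}{3})$. A minimal sketch of that cross-check (names are illustrative):

```python
import math

R3 = math.sqrt(3)

# Weights (T3, T8 eigenvalues) of the three flavor states u, d, s.
WEIGHTS = [(0.5, R3 / 6), (-0.5, R3 / 6), (0.0, -R3 / 3)]

def roots_from_weights(weights):
    """Non-zero roots as pairwise differences of distinct weights."""
    return [(wi[0] - wj[0], wi[1] - wj[1])
            for i, wi in enumerate(weights)
            for j, wj in enumerate(weights) if i != j]

def contains(vectors, target, tol=1e-12):
    """True if some vector in the list matches target within tol."""
    return any(math.hypot(v[0] - target[0], v[1] - target[1]) < tol for v in vectors)
```

All six differences have unit length and sit at the vertices of a regular hexagon, in agreement with the commutator computation.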


Chapter 9 


Bundles and Applications 


9.1 Fiber Bundles 


The late 1970’s was an exciting time to be a graduate student at Berkeley. 
At the time, the University had a powerhouse of some of the top, world-class 
mathematicians in differential geometry and related fields, including, Chern, 
Kobayashi, Wolf, Gilkey and Weinstein; a number of renowned general rel- 
ativity researchers such as Taub, Marsden, and Sachs; as well as a battery 
of visiting faculty and invited speakers at the frontiers of research. Prior to 
1975, with the exception of Professor Sachs, who had a dual appointment in
the physics department, I don’t think I ever saw, either as an undergraduate
or as a graduate student, a physics professor enter the math building or a
math professor walk the hallways of the physics building. It just so happened
that in 1975, Belavin, Polyakov, Schwartz, and Tyupkin published a paper on
pseudoparticle solutions to the Yang-Mills equations [3]. This so-called BPST 
instanton, drew widespread attention in the physics community. The instantons 
are extremals under a variational principle of the Yang-Mills Lagrangian, which 
generalizes the electromagnetic Lagrangian 2.123, to non-Abelian Lie algebras. 
The paper included a provocative discussion of topological properties such as 
homotopy classes and a footnote referring to a particular equation as a Pontr- 
jagin class. A. Trautman [37] is credited as the first mathematician to observe 
that the BPST instanton (and Dirac’s monopole) corresponded to a connec- 
tion on a Hopf bundle. The details will be presented in this chapter. In 1977, 
Schwarz (apparently the correct spelling) used the Atiyah-Singer index theorem 
to show that the number of instantons and zero fermion modes is given by some 
topological invariant (8k — 3) [32]. Perhaps the inclusion of such heavy-duty 
mathematics made some of the particle physicists a bit uncomfortable. I say 
this because in 1977, when I. M. Singer was offered a position at Berkeley, his 
seminars on the Penrose twistor programme and gauge theory got flooded with 
non-Abelian gauge physicists including Mandelstam. The relevance of twistor 
theory to gauge fields was first established by R. Ward in a brilliant short paper 
[39] in which he showed that certain complex vector bundles related to $CP^3$,
in twistor theory, could be used to generate self-dual gauge fields. This also 




drew the attention of algebraic geometers such as R. Hartshorne, working on
moduli spaces of vector bundles. The presence of Singer at Berkeley attracted
a slew of prominent visitors such as M. Atiyah, S. Yau, and later, A. Lichnerowicz.
Atiyah and Singer became major contributors to the mathematical
formulation of Yang-Mills Theory; in particular, in a paper published in 1978, 
[2], Atiyah, Hitchin and Singer computed the dimension of the moduli space 
of irreducible, self-dual connections for Yang-Mills equations in 4-space for all 
compact gauge groups. The dimension of this space for SU(2) is the topological 
invariant (8k — 3) derived by Schwarz. 

Among the physicists attending Singer’s lectures was a young researcher 
named A. Hanson with whom I partnered to become the note-takers for the 
seminar series. The section on Yang-Mills in these notes, is partly distilled from 
my 1977-78 notes with Hanson on the lectures by Singer. A couple of years later, 
Hanson, along with T. Eguchi and P. Gilkey, who introduced me to algebraic 
topology, published a comprehensive work, in which among other things, they 
announced the discovery of a new metric for a gravitational instanton [8] 

We have already encountered several examples of fiber bundles of interest in 
physics. The most fundamental of these are the tangent and cotangent bundles 
and tensor product thereof. We have also discussed the Hopf bundle which is 
endowed with a non-trivial topology. We present now a more formal approach 
to fiber bundles, still keeping in mind that to try to make the material more 
accessible to physicists, we occasionally might sacrifice a bit of rigor in favor of 
simplicity. 


9.1.1 Definition A smooth fiber bundle is a set $\xi = \{E, M, \pi, F\}$ consisting
of manifolds E, M and F and a $C^\infty$ map $\pi : E \to M$ from E onto M satisfying
the following properties:


1. The map is locally trivial. This means that for every point $p \in M$, there
exists an open set U containing p, together with a diffeomorphism
$\phi : \pi^{-1}(U) \to U \times F$. We call $F_p = \pi^{-1}(\{p\})$ the fiber at the point p. We
allow the simpler notation $F_p = \pi^{-1}(p)$ for the fiber at p and

$\pi^{-1}(U) = \bigcup_{p \in U} F_p$

for the fiber space over the set U. This part of the definition says that
locally, the bundle looks like a cross-product; that is, $\pi^{-1}(U) \cong U \times F$.
The pair $\{U, \phi\}$ is called a coordinate neighborhood or a coordinate patch,
or a local trivialization of the bundle.


2. The fiber space is glued in a smooth manner. More specifically, if $\{U_i, \phi_i\}$
and $\{U_j, \phi_j\}$ are two coordinate neighborhoods with nonempty intersection
and $p \in U_i \cap U_j$, then

$\phi_{ij} = \phi_i \circ \phi_j^{-1} : (U_i \cap U_j) \times F \to (U_i \cap U_j) \times F$

is a diffeomorphism. The quantities $\phi_{ij}$ are called transition functions.
The sets $\{U_i\}$ constitute an open cover of M. (See figure 9.1).




Fig. 9.1: Fiber Bundle (the local trivialization $\phi_i : \pi^{-1}(U_i) \to U_i \times F$, followed by the projection $pr_1$ onto $U_i$)


The manifold M is called the base space, E is called the total space or the
bundle space, and $\pi$ the projection map. The notation

$F \to E \xrightarrow{\pi} M$   (9.1)

or simply

$\pi : E \to M$

is also used, probably because it is easier to typeset. There is a common abuse
of language in calling E the bundle space, since the bundle is really the set $\xi$.
A trivial bundle is one that is a simple cross product, $E = M \times F$. In that
sense, we could view $R^2 = R \times R$ as a bundle with base space M = R and
fiber F = R. There would be no advantage to view $R^2$ this way, but it is a
bundle anyway. The total space is the union of all the fibers, which locally does 
look like a cross product, but globally it might have a non-trivial topology, as 
in the case of the Hopf bundle 8.1.4. It is possible to start the treatment of 
fiber bundles with topological spaces, in which case, the projection map would 
be a continuous function, the coordinate maps homeomorphisms, and the base 
space would have the quotient topology. These topological bundles can then be 
given differentiable structures as above. In fact, it would be more natural to 
treat this subject in terms of categories, morphisms, and functors, but we will 
resist the temptation. 


9.1.2 Example Let $M = S^1$, I = [−1,1] and $E = S^1 \times I$. This trivial bundle
is just a cylinder with a circular base space and each fiber a copy of the interval
[−1,1]. Topologically, we can construct the bundle by looking at the base space
as an interval, say $[-\pi, \pi]$ with the end points identified. We glue the bundle by
identifying the fibers at the endpoints, as suggested by the arrows pointing in
the same direction in the picture on the left in figure 9.2. On the other hand, if
we identify the vertical edges in opposite directions as shown in the rectangle on the
right, we get a strip with a twist, which is topologically equivalent to a Möbius
band. A smooth parametrization of this surface and a picture rendering the
surface are given in 5.1.

Fig. 9.2: Möbius Band (left panel: Cylinder; right panel: Möbius Band)


9.1.3 Definition A section s of a bundle $\xi = \{E, M, \pi, F\}$ is a smooth map
$s : M \to E$ such that

$\pi \circ s = id_M$

This is the same concept introduced in the context of the tangent bundle for
which the sections are called vector fields (See figure 1.1). We use the same
notation $\Gamma(E)$ for the set of all smooth sections. For the tangent bundle, the
fibers are copies of $R^n$, so they are vector spaces of dimension n. In such a
case we call the bundle a vector bundle. The tangent bundle $TR^n$ is a trivial
bundle. If $p \in U \subset R^n$, then $\pi^{-1}(U) \cong U \times R^n$ and the coordinate patch maps
are,

$\phi : \left(a^1\frac{\partial}{\partial x^1} + \dots + a^n\frac{\partial}{\partial x^n}\right)_p \mapsto (p, a^1, \dots, a^n).$

This slightly more formal point of view is consistent with our earlier definition
of a tangent vector. A vector bundle has more structure than a run-of-the-mill
fiber bundle. If (p, f) is a point in the bundle in the intersection of two
coordinate patches $\{U_i, \phi_i\}$ and $\{U_j, \phi_j\}$, the transition functions satisfy,

$\phi_{ij}(p, f) = (p, \phi_{ij}(p)\,f),$


where $\phi_{ij}(p) \in GL(n, R)$¹ gives a linear isomorphism on the fibers. For the
tangent bundle, this is yet another way of saying that if we change coordinates
near p, the components of a tangent vector at p change by an action of an
element of GL(n, R) represented by the Jacobian. If the fibers are k-dimensional
with a GL(k, R) action on the fibers, we just say that the vector bundle is
k-dimensional, even though the real total dimension is (n + k). A vector bundle
of dimension 1 is called a line bundle. The normal bundle of the sphere $S^2$ in
$R^3$ would be an example of a line bundle. Since the fibers of vector bundles are
vector spaces and every vector space has a 0 vector, there is a special section
s such that s(p) = (p, 0); this is called the zero section. This is a trivial global
section in all vector bundles. The set of all sections of a vector bundle has
a natural structure of a vector space, in which the zero-section is the zero-vector.
There is no problem finding non-singular global sections of the tangent
bundle $T(R^n)$. However, for submanifolds M of $R^n$, there might be topological
obstructions to the existence of vector fields that are non-zero everywhere. For
example, the reader may be acquainted with the theorem that one can not
“comb” a hairy sphere $S^2$. As proved by Poincaré, the obstruction in this
case is the Euler characteristic which would need to vanish, but for the sphere,
$\chi(S^2) = 2$.


l Again, we adopt this notation with reluctance, although is common in the literature. The 
quantities p;j(p) are matrix-valued, so they really should be written as ° ; (p) 




9.1.4 Definition A covering space of a space M is a bundle $\xi = \{E, M, \pi, F\}$
in which every point $p \in M$ has a neighborhood $U \subset M$ such that $\pi^{-1}(U)$
is the disjoint union of sets $V_k$, each homeomorphic to U. In other words, the
fibers over U are discrete. Here, $\pi$ is a local homeomorphism and M has the
quotient topology.


9.1.5 Example Most likely, the first examples of covering spaces students
encounter in a course in algebraic topology are associated with the circle $S^1$.
The map $\pi : S^1 \to S^1$ given by

$\pi(e^{i\theta}) = e^{in\theta}, \quad n \in Z^+$

gives an n-sheet covering of $S^1$ for each positive integer n. One can envision
this covering as a rubber band folded into n loops around a cylinder. The
continuous homomorphism $\pi : R \to S^1$ given by

$\pi(t) = e^{2\pi i t}$

is a covering of $S^1$ by the real line. A covering space that is simply connected, as
is the case here, is called a universal covering space. This map is the starting
point in establishing that the fundamental group of $S^1$, also called the first
homotopy group $\pi_1$, is given by

$\pi_1(S^1) \cong Z$

A differentiable manifold structure is not required in this example. All that is
needed is that $S^1 = R/Z$ is a topological group, R is simply connected, and
Z is a discrete subgroup of R. The homotopy equivalence classes are the loops
that have the same winding number. The additive group structure is given by
adding the number of loops of two group elements.
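
The n-sheet covering can be made concrete by listing the fiber over a point of the circle. The sketch below (function names are illustrative) treats $S^1$ as the unit complex numbers and returns the n preimages of a point w under $z \mapsto z^n$.

```python
import cmath
import math

def covering_map(z, n):
    """The n-sheet covering S^1 -> S^1, z -> z^n, on unit complex numbers."""
    return z ** n

def fiber(w, n):
    """The n points of S^1 lying over w, i.e. the n-th roots of w."""
    angle = cmath.phase(w)
    return [cmath.exp(1j * (angle + 2 * math.pi * k) / n) for k in range(n)]
```

Each fiber is a discrete set of n points equally spaced on the circle, which is the discreteness required in definition 9.1.4.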


9.1.6 Example The projective space $RP^n$ of lines in $R^{n+1}$
is defined by the quotient space of $S^n$ obtained by identifying
antipodal points; that is, the two antipodal points of intersection
of the sphere with a line through the origin. The
covering space of $RP^n$ is the real version of the Hopf bundle

$\pi : S^n \to RP^n,$

with fiber $Z_2$. The group of covering transformations consists of the identity and
the antipodal map. In this example

$\pi_1(RP^n) = Z_2.$


I have to thank professor P. Gilkey for motivating me to overcome the fear of 
algebraic topology machinery, with his wonderful illustration of the above, for 
the case n = 3. He showed up the first day of classes with a toy consisting of 
two equilateral triangles, one large, one small. The triangles were connected at 
corresponding vertices with three separate untangled strings. He set the large 




triangle on the table and flipped the small triangle in the air by half revolution. 
He asked for a volunteer in a class of 5 students to untangle the strings only 
allowing parallel translations of the small triangle. Clearly, it was not doable. 
He reset the toy and then flipped the small triangle a full revolution. I got 
lucky and untangled the strings almost immediately. He proceeded to illustrate 
by combinations of half-turns and full-turns, until it was almost self-evident, 
that there were only two possible outcomes. Of course, the key question of the 
day was, how does one prove that? He explained that the deformations of the 
shape of the strings by parallel motion were examples of homotopies; that a 
rotation in space left two antipodal points on a sphere fixed; defined projective 
space; and concluded that what we had here was a manifestation that the first
homotopy group of the projective space had two elements. Four weeks later
we had learned enough tools to prove the assertion. 


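9.1.7 Definition Let $\xi = \{E, M, \pi, F\}$ and $\xi' = \{E', M', \pi', F'\}$ be smooth
vector bundles. A vector bundle map is a pair of smooth maps $f_M : M \to M'$
and $f : E \to E'$, such that


1. The diagram 9.3 commutes,

$$\begin{array}{ccc} E & \xrightarrow{\;f\;} & E' \\ \pi \downarrow & & \downarrow \pi' \\ M & \xrightarrow{\;f_M\;} & M' \end{array}$$

Fig. 9.3: Bundle Map: $f_M \circ \pi = \pi' \circ f$.


2. The map induced by $f$ on the fibers is a linear map.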


The meaning of this commuting diagram is that fibers are mapped to fibers.
Indeed, if $F_p = \pi^{-1}(p)$ is a fiber at $p \in M$, then,

$$(\pi' \circ f)(F_p) = (f_M \circ \pi)(F_p) = (f_M \circ \pi)(\pi^{-1}(p)) = f_M(p),$$

so that a point in the fiber of $p \in U \subset M$ lands on a point in the fiber of
$f_M(p)$. Thus it makes sense to say that the map induced on the fibers needs
to be a linear map of vector spaces. More specifically, if $(p, v) \in E$, then
$f(p, v) = (f_M(p), T(p) \cdot v)$, where $T : U \to L(F, F')$ gives a linear map $T(p)$
from the fiber $F_p$ to the fiber $F'_{f_M(p)}$. If these linear maps are vector space
isomorphisms and $f_M$ is a diffeomorphism, the bundle map is called a vector
bundle isomorphism. In this case, the two bundles are essentially the same.


9.1.8 Definition Pull-back bundle 




Let $\pi : E \to M$ be a vector bundle with fiber $F$ and let $f : M' \to M$ be a
smooth map. We can define a pull-back bundle denoted by $f^*\pi : f^*E \to M'$
by assigning to each point $p' \in M'$ the fiber $F_p$
at $p = f(p')$. More precisely,

$$f^*E = \{(p', v) \in M' \times E \mid f(p') = \pi(v)\}, \quad f^*\pi(p', v) = p'. \quad (9.2)$$

It is clear that $f^*E \subset M' \times E$ is locally trivial, as it should be. If $\{U_i, \phi_i\}$ is a
cover of $\pi : E \to M$ by coordinate patches, then $\{f^{-1}(U_i)\}$ is a cover of $M'$
with transition functions $f^*\phi_{ij}(p') = \phi_{ij}(p)$, where $p = f(p')$. If

$$M'' \xrightarrow{\;g\;} M' \xrightarrow{\;f\;} M,$$

then,

$$(f \circ g)^* E \cong g^*(f^* E).$$

This should elicit memories of the properties of the pull-back of differential
forms 2.68.
One of the main applications of pull-back bundles is the interesting relation to
homotopy. Recall from definition 7.1.16, that two maps $f, g : M' \to M$ are
called homotopic if there exists a map $\phi : M' \times [0,1] \to M$, such that

a) $\phi(p', 0) = f(p')$,

b) $\phi(p', 1) = g(p')$.

The main theorem in this regard is that if $f$ and $g$ are homotopic, then the
pull-back bundles $f^*E \cong g^*E$ are isomorphic.


9.1.9 Corollary Let $M' = M$ be a contractible space and suppose the
homotopy is a deformation retract (see 7.1.17),

$$\phi_t(p) = \phi(p, t) = tp,$$

so that $f = \phi_0$ is a constant map and $g = \phi_1$ is the identity.
Then the theorem says that $E = g^*E \cong f^*E = M \times F$, which proves that if $M$
is contractible, then $E$ is trivial.


9.2 Principal Fiber Bundles 


A smooth principal fiber bundle (PFB) is essentially a fiber bundle in which
the fibers are Lie groups, or manifolds on which there is a free and transitive
action by a Lie group. In that sense, one can view the base space as the
parameter space for a family of fibers $F$ where at any point $p$ in the base
space, the fiber $F_p$ is diffeomorphic to a Lie group. More formally, we have the
following definition.


9.2.1 Definition Principal fiber bundle 

A smooth principal fiber bundle is a set $\xi = \{E, M, \pi, F, G\}$, where $F \to E \xrightarrow{\;\pi\;} M$
is a fiber bundle, and $G$ is a Lie group acting on $E$ freely along the fibers.
As a reminder, $G$ acts on $E$ on the right, if there exists a smooth map
$\mu : E \times G \to E$ given by $(b, g) \mapsto bg$. The action is free along the fibers if the
only group element that fixes a point is the identity, and the action is transitive
if any two points on the same fiber are connected by an element of the group.
Thus, if the action is free and transitive, the fibers are diffeomorphic to the
orbits of the group. This means that the bundles

$$\begin{array}{ccc} E & = & E \\ \pi \downarrow & & \downarrow pr \\ M & \cong & E/G \end{array}$$

are isomorphic. Here, $pr$ is the natural projection of $E$ onto its space of orbits.


9.2.2 Example Bundle of Frames 


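Fig. 9.4: Bundle of Frames. A chart on $B(M)$ takes values in $\mathbb{R}^n \times Gl(n, \mathbb{R})$, over a chart on $M$ with values in $\mathbb{R}^n$.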


One of the most important principal fiber bundles is the bundle of frames. 
We hope that the following discussion does not obscure simplicity for the sake 
of formalism. The bundle of frames is the structure one gets by attaching 
to each point on a manifold, the space of all possible frame fields of tangent 
vectors. The structure group is Gl(n, R), which acts on the fibers at each point 
on the manifold, by matrix multiplication that changes one frame into another. 
Parallel transport of vectors and frames on the manifold corresponds to choosing
a horizontal subspace of the tangent space of the bundle. Choosing a horizontal
subspace in the bundle is then equivalent to choosing a connection.


9.2.3 Definition Let $M$ be a smooth manifold of dimension $n$, and let
$\{(\phi_i, U_i)\}$ be an atlas of coordinate charts covering $M$. As usual, we label the
coordinates of $p \in U_i$ as $(x^1, \dots, x^n)$. The frame bundle $B(M)$ is defined as the
bundle $\pi : B(M) \to M$, where

$$B(M) = \{(p, e_1, e_2, \dots, e_n) : p \in M,\ (e_1, e_2, \dots, e_n) \text{ is a basis for } T_p(M)\}.$$

The projection map $\pi$ is given by

$$\pi(p, e_1, \dots, e_n) = p.$$




Let $(\phi, U)$ be a coordinate chart with coordinates $(x^1, \dots, x^n)$, with $\tilde{U} = \pi^{-1}U$.
As shown in figure 9.4, we can lift the chart in $M$

$$\phi : U \to \mathbb{R}^n$$

to a chart in $B(M)$,

$$\tilde{\phi} : \tilde{U} \to \mathbb{R}^n \times Gl(n, \mathbb{R}) \subset \mathbb{R}^{n+n^2},$$

as follows. Let $\{\partial_i = \frac{\partial}{\partial x^i}\}$ be the standard basis for the tangent space $T_pM$,
associated with the coordinates $p = (x^1, \dots, x^n)$. Let $b \in B(M)$ be a point
with coordinates

$$b = (p, e_1, \dots, e_n)$$

on the fiber $F_p$. Then, as shown in equation 3.1, there exists a matrix $A \in Gl(n, \mathbb{R})$, with

$$e_i = A^j{}_i\, \partial_j.$$

Matrix multiplication by $A$ is on the right, but, as in 3.1, we choose to write
the equation as above, to make it clear that $\partial_j$ is not acting as a differential
operator on the components of $A$. Thus, $b \in F_p$ can be written as

$$b = \left(p,\ A^j{}_i \frac{\partial}{\partial x^j}\right).$$

The bundle coordinate patch on $\tilde{U}$ is defined by

$$\tilde{\phi}(p, e_1, \dots, e_n) = (p, A^j{}_i).$$

Here, we identify the matrix $A \in Gl(n, \mathbb{R})$ with a point in $\mathbb{R}^{n^2}$, where the
coordinates in $\mathbb{R}^{n^2}$ are given by the entries in the column vectors of $A$. The
standard coordinates of $Gl(n, \mathbb{R})$ are given by the matrices $X^i{}_j$ that have a 1
entry in the $i^{th}$ row, $j^{th}$ column, and a 0 entry everywhere else.

The right action $\mu : B(M) \times Gl(n, \mathbb{R}) \to B(M)$ along the fibers is given by

$$\mu : (b, g) \mapsto bg = (p,\ f_i = e_j\, g^j{}_i), \quad g \in Gl(n, \mathbb{R}).$$

Given two overlapping charts in $B(M)$

$$\tilde{\phi}_i : \pi^{-1}(U_i) \to U_i \times Gl(n, \mathbb{R}),$$
$$\tilde{\phi}_j : \pi^{-1}(U_j) \to U_j \times Gl(n, \mathbb{R}),$$

with transition functions $\phi_{ij}$ on $U_i \cap U_j$, given by $\phi_{ij} = \tilde{\phi}_i \circ \tilde{\phi}_j^{-1}$. The group
action on the fibers satisfies,

$$(bg_1)g_2 = b(g_1 g_2).$$

The atlas $(\tilde{\phi}_i, \tilde{U}_i)$ thus gives $B(M)$ the structure of a differentiable manifold.
A local section of the bundle $s_U \in \Gamma(B(M))$ represents a smooth choice of a
family of frames at points in $U$, with $\pi \circ s_U = id$.
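
The right action on frames described above can be made concrete with a small numerical sketch. It assumes a frame at a point is stored as a matrix whose columns are the components of $e_1, \dots, e_n$ in the coordinate basis, so that $f_i = e_j\, g^j{}_i$ is matrix multiplication on the right. The helper names `matmul` and `act` and the sample matrices are hypothetical, chosen only for illustration.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def act(frame, g):
    """Right action b -> bg on a frame: f_i = e_j g^j_i.

    The columns of `frame` hold the components of e_1, ..., e_n,
    so the action is matrix multiplication on the right.
    """
    return matmul(frame, g)

# A sample frame at a point (columns are e_1, e_2) and two elements of Gl(2, R).
A = [[2, 1], [0, 1]]
g1 = [[1, 3], [0, 1]]
g2 = [[0, -1], [1, 2]]

lhs = act(act(A, g1), g2)      # (b g1) g2
rhs = act(A, matmul(g1, g2))   # b (g1 g2)
print(lhs == rhs)
```

The equality of the two results is just associativity of matrix multiplication, which is what makes the map a right action.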




9.2.4 Example Let $G$ be a Lie group and $H \subset G$ a compact subgroup. Then
$\pi : G \to G/H$, where $\pi$ is the projection map onto the orbit space, is an $H$-bundle.
This is one of the early results in the theory of fiber bundles, first
proved by H. Samelson in 1941.


9.2.5 Example The projective space fibration 

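Recall that the complex projective space $\mathbb{C}P^n$ is the space of complex lines through
the origin in $\mathbb{C}^{n+1}$. Restricting to the unit sphere $S^{2n+1} \subset \mathbb{C}^{n+1}$, it is the
quotient space $S^{2n+1}/\sim$, where $a, b \in S^{2n+1}$ are equivalent, $a \sim b$, if there exists
a $c \in S^1$ such that $a = bc$. Then

$$S^{2n+1} \xrightarrow{\;\pi\;} \mathbb{C}P^n$$

is a principal $U(1)$ bundle. The special case for $n = 1$ is the ubiquitous Hopf
map 8.32.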


9.2.6 Example The SO(n) bundle 
The group of special orthogonal matrices

$$SO(n) = \{A \in M_{n \times n}(\mathbb{R}) : A^{-1} = A^T,\ \det A = 1\}$$

acts transitively on the unit sphere $S^{n-1} \subset \mathbb{R}^n$. The subgroup that leaves the
north pole fixed, that is, the isotropy subgroup of the point $e_1 = (1, 0, \dots, 0)$,
is the set of matrices of the form

$$\begin{pmatrix} 1 & 0 \\ 0 & B \end{pmatrix}, \quad B \in SO(n-1).$$

Thus, $SO(n)/SO(n-1)$ is diffeomorphic to $S^{n-1}$. This, with the projection
map

$$\pi : SO(n) \xrightarrow{\;SO(n-1)\;} S^{n-1}$$

constitutes an $SO(n-1)$ bundle.


9.2.7 Example The U(n) bundle 
The group of unitary matrices

$$U(n) = \{A \in M_{n \times n}(\mathbb{C}) : A^{-1} = A^\dagger\}$$

acts transitively on the unit sphere $S^{2n-1} \subset \mathbb{C}^n$. The subgroup that leaves the
north pole fixed, that is, the isotropy subgroup of $e_1 = (1, 0, 0, \dots, 0)$, is the set
of matrices of the form,

$$\begin{pmatrix} 1 & 0 \\ 0 & B \end{pmatrix}, \quad B \in U(n-1).$$

Thus, $U(n)/U(n-1)$ is diffeomorphic to $S^{2n-1}$. This, with the projection map

$$\pi : U(n) \xrightarrow{\;U(n-1)\;} S^{2n-1}$$

constitutes a $U(n-1)$ bundle. The bundle structure above is also true if
the unitary group is replaced by the special unitary group, which corresponds
to requiring the matrices $A$ to have $\det A = 1$, or equivalently, to picking an
orientation of the frames.


9.2.8 Example The Grassmannian 

Let $F$ denote either of the fields $\mathbb{R}$ or $\mathbb{C}$. The space of orthonormal
$k$-frames in $F^n$ is called the Stiefel manifold $V_k(F^n)$. We can characterize the
Stiefel manifolds as the set of $n \times k$ matrices,

$$V_k(F^n) = \{A \in M_{n \times k} : A^T A = I_k\}, \quad (9.3)$$

where $I_k$ is the $k \times k$ identity matrix, and the transpose is replaced by the conjugate
transpose in the complex case. We interpret $I_k$ as a matrix $I_k = [e_1, e_2, \dots, e_k]$
of orthonormal basis vectors. Viewed as representing linear transformations
$L(F^k, F^n)$, the matrices $A$ have rank $k$.

Analogous to the construction of projective spaces, we say that two matrices
$A, B \in V_k(F^n)$ are equivalent, $A \sim B$, if there exists a $k \times k$ orthogonal (or
unitary in the complex case) matrix $C$ such that $A = BC$. The Grassmannian
is defined as

$$Gr(k, F^n) = V_k(F^n)/\sim. \quad (9.4)$$

The Grassmannian is the space of $k$-planes in $F^n$. In $\mathbb{R}^n$ the projection map

$$\pi : Gr(k, \mathbb{R}^n) \xleftarrow{\;O(k)\;} V_k(\mathbb{R}^n), \quad (9.5)$$

is a principal bundle with fiber group $O(k)$. The group $O(n)$ acts transitively
on $V_k(\mathbb{R}^n)$, and the isotropy subgroup of the $n \times k$ matrix

$$A = \begin{pmatrix} I_k \\ 0 \end{pmatrix}$$

consists of the matrices in $O(n)$ of the form

$$\begin{pmatrix} I_k & 0 \\ 0 & B \end{pmatrix},$$

where $B \in O(n-k)$. Thus, we have a diffeomorphism,

$$O(n)/O(n-k) \to V_k(\mathbb{R}^n),$$

and we can write the Grassmannian bundle as,

$$\pi : Gr(k, \mathbb{R}^n) \xleftarrow{\;O(k)\;} O(n)/O(n-k). \quad (9.6)$$

Equivalently, we have

$$Gr(k, \mathbb{R}^n) = \frac{O(n)}{O(k) \times O(n-k)}. \quad (9.7)$$




In the complex and quaternionic vector spaces, we have

$$Gr(k, \mathbb{C}^n) = \frac{U(n)}{U(k) \times U(n-k)},$$
$$Gr(k, \mathbb{H}^n) = \frac{Sp(n)}{Sp(k) \times Sp(n-k)}. \quad (9.8)$$
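
The equivalence relation defining the Grassmannian can be illustrated with exact rational arithmetic. The sketch below is only an illustration under stated assumptions: it takes a point $A$ of $V_2(\mathbb{R}^3)$, multiplies it on the right by an orthogonal matrix $C \in O(2)$, and checks that $AC$ is again an orthonormal frame spanning the same plane. The plane is compared through the matrix $A A^T$, a device introduced here for convenience and not used in the text; the helper names and the sample matrices are hypothetical.

```python
from fractions import Fraction as Fr

def matmul(a, b):
    """Product of two matrices given as lists of rows (shapes must be compatible)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def identity(k):
    return [[1 if i == j else 0 for j in range(k)] for i in range(k)]

# A 3 x 2 matrix with orthonormal columns: a point of the Stiefel manifold V_2(R^3).
A = [[Fr(3, 5), 0], [Fr(4, 5), 0], [0, 1]]
# A rotation in O(2) with rational entries.
C = [[Fr(3, 5), Fr(-4, 5)], [Fr(4, 5), Fr(3, 5)]]

B = matmul(A, C)                                   # an equivalent frame, A ~ B
in_stiefel = matmul(transpose(B), B) == identity(2)
same_plane = matmul(A, transpose(A)) == matmul(B, transpose(B))
print(in_stiefel, same_plane)
```

Both checks succeed because $C^T C = C C^T = I_2$, so the frame changes while the plane it spans does not.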


The Grassmannian of lines in $\mathbb{R}^3$ is the projective space $\mathbb{R}P^2$ and the Grassmannian
of planes is the same since to every line through the origin, there corresponds
a unique orthogonal plane. Hence, the simplest Grassmannian that is
not a projective space is the space $Gr(2, \mathbb{R}^4)$ of planes in $\mathbb{R}^4$, or equivalently,
the space of projective lines in $\mathbb{R}P^3$. The space $\mathbb{C}P^3$ of projective lines in $\mathbb{C}^4$
with a Hermitian metric of signature $(+ + - -)$ is the base space for twistor
theory; in this case the symmetry group preserving the metric is $SU(2, 2)$ which
is the double cover of the conformal group. It has been cited by R. Penrose
[29], that the geometrical foundation for the theory can be traced back to the
work of Plücker and Klein on subspaces of planes. In view of this historical
setting, we present a brief summary of the parametrization of two dimensional
subspaces of $\mathbb{R}^4$ used by Plücker. Given two vectors $v = (v_1, v_2, v_3, v_4)$ and
$w = (w_1, w_2, w_3, w_4)$, we can determine a plane by a linear map $L(\mathbb{R}^2, \mathbb{R}^4)$
with matrix representation given by a $4 \times 2$ matrix $A$ of rank 2, whose column
vectors are $v^T$ and $w^T$. That is,

$$A = \begin{pmatrix} v_1 & w_1 \\ v_2 & w_2 \\ v_3 & w_3 \\ v_4 & w_4 \end{pmatrix}.$$

The Grassmannian $Gr(2, \mathbb{R}^4)$ is the space of equivalence classes of such matrices.
Plücker coordinates $p_{ij}$ are defined by the determinants of pairs of rows $i$
and $j$,

$$p_{ij} = v_i w_j - v_j w_i,$$

$$P = \begin{pmatrix} 0 & p_{12} & p_{13} & p_{14} \\ -p_{12} & 0 & p_{23} & p_{24} \\ -p_{13} & -p_{23} & 0 & p_{34} \\ -p_{14} & -p_{24} & -p_{34} & 0 \end{pmatrix}.$$

We may view the antisymmetric matrix $P = (p_{ij})$ as an element of the Lie
algebra $\mathfrak{so}(4)$, and the quantities $(p_{12} : p_{13} : p_{14} : p_{34} : p_{24} : p_{23})$ as homogeneous
coordinates in $\mathbb{R}P^5$. A short computation using the antisymmetry property in
the definition of $p_{ij}$, yields,

$$p_{12}p_{34} - p_{13}p_{24} + p_{14}p_{23} = 0.$$

The square of this entity is equal to $\det(P)$. This equation, up to a constant,
represents a quadric hypersurface $Q$ in $\mathbb{R}P^5$. The permutation signs of the




equation give a hint that there is a duality lurking somewhere, namely the 
orthogonal subspaces. Introducing independent coordinates 


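$$p_{12} = X + R, \quad p_{34} = X - R,$$
$$p_{13} = S - Y, \quad p_{24} = S + Y,$$
$$p_{14} = Z + T, \quad p_{23} = Z - T,$$

the equation becomes,

$$(X^2 + Y^2 + Z^2) - (R^2 + S^2 + T^2) = 0.$$

By re-scaling the homogeneous coordinates we can write the quadric $Q$ as,

$$X^2 + Y^2 + Z^2 = 1, \quad R^2 + S^2 + T^2 = 1.$$

Thus, up to the identification of antipodal points, $Q$ has the topology of the product $S^2 \times S^2$.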
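
The Plücker relation and its connection with $\det(P)$ are easy to check by direct computation. The following is a minimal sketch in exact integer arithmetic; the function names and the sample vectors are hypothetical and serve only as an illustration of the formulas above.

```python
def pluecker(v, w):
    """Return {(i, j): p_ij} with p_ij = v_i w_j - v_j w_i for 1 <= i < j <= 4."""
    return {(i, j): v[i - 1] * w[j - 1] - v[j - 1] * w[i - 1]
            for i in range(1, 5) for j in range(i + 1, 5)}

def quadric(p):
    """The expression p12 p34 - p13 p24 + p14 p23."""
    return p[(1, 2)] * p[(3, 4)] - p[(1, 3)] * p[(2, 4)] + p[(1, 4)] * p[(2, 3)]

def antisymmetric_matrix(p):
    """Assemble the 4 x 4 antisymmetric matrix P = (p_ij)."""
    P = [[0] * 4 for _ in range(4)]
    for (i, j), value in p.items():
        P[i - 1][j - 1] = value
        P[j - 1][i - 1] = -value
    return P

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

v, w = (1, 2, 3, 4), (0, 1, -1, 2)
p = pluecker(v, w)
relation = quadric(p)          # vanishes for coordinates coming from a plane

# For an arbitrary antisymmetric matrix, det(P) is the square of the same expression.
q = {(1, 2): 1, (1, 3): 2, (1, 4): 3, (2, 3): 4, (2, 4): 5, (3, 4): 6}
det_q = det(antisymmetric_matrix(q))
print(relation, det_q, quadric(q) ** 2)
```

Replacing the pair $(v, w)$ by another basis of the same plane multiplies every $p_{ij}$ by the determinant of the change of basis, which is why the $p_{ij}$ are only homogeneous coordinates.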
9.2.9 Example $S^3$


The $U(n)$ bundle with $n = 2$ gives $U(2)/U(1) \cong S^3$.
The Grassmannian $Gr(1, \mathbb{C}^2) \cong \mathbb{C}P^1 \cong S^2$, gives the bundle

$$U(2)/U(1) \xrightarrow{\;U(1)\;} \frac{U(2)}{U(1) \times U(1)},$$

$$S^3 \xrightarrow{\;S^1\;} S^2.$$

This view of the Hopf bundle is the foundation for the argument that a three-sphere
$S^3$ is homeomorphic to the union of two solid tori whose intersection is
their common boundary, with topology $S^1 \times S^1$. As a static picture, figures
such as in 8.3 are as good as it gets in trying to visualize the union of the two
tori.


9.2.10 Definition Associated vector bundle 

Let $\xi = \{E, M, \pi, F, G\}$ be a PFB. Suppose there is a vector space $V$ on which
$G$ acts on the left. Let $(e, v) \in E \times V$ and $g \in G$. We can define an action on
the product $E \times V$ by

$$(e, v)g = (eg, g^{-1}v).$$

Denote $(E \times V)/G$ by $E \times_G V$. Then the natural projection map,

$$\pi_V : E \times_G V \to M$$

defines a vector bundle called the associated vector bundle.


9.2.11 Example If $E = B(M)$ is the bundle of frames, and $V = \mathbb{R}^n$, the
associated vector bundle is the tangent bundle.
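
A short computation shows why this works. A pair (frame, components) determines the tangent vector $X = e_i v^i$, and the equivalent pair $(eg, g^{-1}v)$ determines the same vector. The sketch below checks this in exact arithmetic for a $2 \times 2$ case; storing a frame as a matrix of columns, as well as the helper names and the sample data, are assumptions made only for this illustration.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def matvec(a, v):
    """Apply a matrix to a column vector."""
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]

def inverse_2x2(g):
    """Inverse of an invertible 2 x 2 integer matrix, computed exactly."""
    d = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[Fraction(g[1][1], d), Fraction(-g[0][1], d)],
            [Fraction(-g[1][0], d), Fraction(g[0][0], d)]]

frame = [[2, 1], [0, 1]]   # columns are e_1, e_2 in the coordinate basis
v = [3, -1]                # components of a vector with respect to the frame
g = [[1, 3], [2, 7]]       # an element of Gl(2, R)

vector = matvec(frame, v)                 # X = e_i v^i
new_frame = matmul(frame, g)              # e -> e g
new_v = matvec(inverse_2x2(g), v)         # v -> g^{-1} v
same_vector = matvec(new_frame, new_v)    # the class [(eg, g^{-1} v)] gives the same X
print(vector, same_vector)
```

The two results agree because $(Ag)(g^{-1}v) = Av$, so each equivalence class in $B(M) \times_G \mathbb{R}^n$ corresponds to a single tangent vector.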




9.3 Connections on PFB’s 


We have noted earlier, that in a Riemannian manifold, there exists a unique,
torsion-free connection compatible with the metric (see theorem 6.9). The metric on the manifold allows us
to define orthonormal frames, and the connection gives a prescription on how
to parallel transport tangent vectors and frames. We now present the bundle
viewpoint of connections. We will do this quite generally, but the reader should
keep in mind the bundle of frames as the model space. There is a learning curve
for the mathematical formalism, but the idea is very intuitive. We illustrate
this with the bundle of frames. Given a point $p$ on a manifold $M$, the fiber of the
bundle of frames $B(M)$ consists of the point and all the frames at that point.
The action of the general linear group along the fibers transforms one frame at
$p$ onto another frame at $p$. Since the right action of the group is transitive and
effective, there is a natural way to identify a vertical direction in the tangent
space at a point $b$ on the bundle, namely, a vertical tangent vector at $b$ on
the frame bundle corresponds to a frame at the point $p$ in the base manifold.
If the frames are restricted to be orthonormal, the picture is the same, but
one has to reduce the group to the orthogonal group. Thus, the action of the
group tells us how to move frames along the fibers, but it does not tell us how
to move a frame to the fiber of a nearby point on the manifold. This requires
more structure, namely a connection. In the case of a Riemannian manifold, the
natural structure is provided by the Levi-Civita connection, which quantifies
how to parallel transport a frame along any particular curve. Lifting the curve
and then moving the frames along that curve in the bundle would then yield
a section of the bundle that we could identify as a horizontal direction. Thus,
the basic idea of a connection on a principal fiber bundle amounts to choosing
a horizontal direction for the tangent space of the bundle.

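Let $\xi = \{E, M, \pi, F, G\}$ be a principal fiber bundle and let the coordinate
chart $U_i$ give a local trivialization of the bundle

$$\begin{array}{ccc} \pi^{-1}(U_i) & \xrightarrow{\;\phi_i\;} & U_i \times G \\ & \pi \searrow \quad \swarrow pr_1 & \\ & U_i & \end{array}$$

Let $p \in U_i$ and $b \in F_p \subset \pi^{-1}(U_i)$, so that $\pi(b) = p$. Here we use the notation

$$\phi_i : \pi^{-1}(U_i) \to U_i \times G,$$
$$\phi_i(b) = (\pi(b), \varphi_i(b)).$$

On the overlap $U_i \cap U_j$ of two coordinate charts with $U_i \cap U_j \neq \emptyset$,
the transition functions $\varphi_{ij} = \varphi_i \cdot \varphi_j^{-1}$ give a map

$$\varphi_{ij} : U_i \cap U_j \to G,$$
$$p \mapsto \varphi_{ij}(p).$$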



If

$$s_i : U_i \to \pi^{-1}(U_i),$$
$$s_j : U_j \to \pi^{-1}(U_j),$$

are sections of the bundle over sets with $U_i \cap U_j \neq \emptyset$, then on the overlap the
sections are related by

$$s_j(p) = s_i(p)\, \varphi_{ij}(p). \quad (9.9)$$


9.3.1 Definition Let $\xi = \{E, M, \pi, F, G\}$ be a principal fiber bundle and let
$b \in F_p$ be a point of the bundle in the fiber $F_p$. A tangent vector $Y \in T_bE$
is called vertical if

$$\pi_* Y = 0.$$

The vector space $V_b$ of all such vectors is called the vertical subspace of $T_bE$.
If $X \in \mathfrak{g}$, the action of the group $G$ on $E$ induces a fundamental vector field
$Y = \sigma(X)$ as defined by equation 7.74. Such a vector field would then yield a
vertical vector at any point $b \in F_p$.


9.3.1 Ehresmann Connection 


We now introduce the following, 


9.3.2 Definition Ehresmann connection 
A connection $\Gamma$ on a principal fiber bundle is a choice of a subspace $H_b$ of $T_bE$,
such that

a) For each $b \in E$, we have $T_bE = V_b \oplus H_b$,

b) $R_{g*} H_b = H_{bg}$,

c) $H_b$ is a $C^\infty$ distribution.


The vector space $H_b$ is called the horizontal subspace of $T_bE$ at $b$, and tangent
vectors in this space are called horizontal. Condition (a) says that any tangent
vector $Y \in T_bE$ can be split as a sum of a vertical and a horizontal component
that we denote as,

$$Y = vY + hY.$$

Condition (b) says that the distribution $b \mapsto H_b$ is right-invariant under the
action of $G$. A connection on a principal fiber bundle as defined above is called
an Ehresmann connection.

Given a connection $\Gamma$ we define a Lie algebra valued one-form

$$\omega : TE \to \mathfrak{g}$$

that assigns to each tangent vector $Y \in TE$ the unique vector $X \in \mathfrak{g}$
whose fundamental vector field $\sigma(X)$ is the vertical component of $Y$. In the
language of distributions, $H_b$ is the kernel of the map,

$$H_b = \{Y \in T_bE : \omega(Y) = 0\}.$$




Since the map $\omega$ is onto for every $b \in E$, the kernel $H_b$ is a linear subspace of
$T_bE$ with dimension equal to the dimension of $M$. To be clear,
we are saying,

$$\omega(Y) = \begin{cases} X & \text{if } Y = \sigma(X), \\ 0 & \text{if } Y \text{ is horizontal.} \end{cases}$$

Here we assume that the space $H_b$ annihilated by $\omega$ is a $C^\infty$ distribution in the
sense of Frobenius 7.44.

Motivated by the formula for the pull-back of the Maurer-Cartan form 7.66 
and by equation 7.75, we can equivalently characterize an Ehresmann connec- 
tion, by the conditions stated in the following, 


Fig. 9.5: Ehresmann Connection 


9.3.3 Theorem An Ehresmann connection on a principal fiber bundle $\xi = \{E, M, \pi, F, G\}$
is a smooth, Lie algebra valued one-form $\omega$ on $E$ that satisfies
the following conditions:

a) $\omega(\sigma(X)) = X$, for all $X \in \mathfrak{g}$,

b) $R_g^*\omega(Y) = Ad_{g^{-1}}\omega(Y)$, for all $g \in G$, and all tangent vectors $Y$ on $E$.


Part (a) is immediate from the definition. Part (b) essentially follows from the
fact that right translation maps the fundamental vector field $\sigma(X)$ to the
fundamental vector field $\sigma(Ad_{g^{-1}}X)$. Every $Y$ can
be split uniquely into a horizontal and a vertical component. If $Y$ is horizontal,
then both sides of the equation are zero, since $R_{g*}$ preserves horizontal subspaces.
If $Y$ is vertical, it is the fundamental vector field $\sigma(X)$ for some $X \in \mathfrak{g}$. Then

$$(R_g^*\omega)_b(Y) = \omega_{bg}(R_{g*}Y),$$
$$= \omega_{bg}(\sigma(Ad_{g^{-1}}X)) \quad \text{by equation 7.75,}$$
$$= Ad_{g^{-1}}X,$$
$$(R_g^*\omega)_b(Y) = Ad_{g^{-1}}(\omega_b(Y)). \quad (9.10)$$

We now show how to pull a connection $\omega$ on a principal fiber bundle
$E$ down to a family of local connections on the manifold. Here we adapt the
procedure from Kobayashi and Nomizu [18], with apologies to the authors for 




diminishing the elegance of their proof, for the sake of clarity provided by adding 
more details. Let $\{U_i\}$ be an open cover of $M$ with coordinate charts $\{(\phi_i, U_i)\}$.
For each $i$ let $s_i(p)$ be the section over $U_i$ defined by

$$s_i(p) = \phi_i^{-1}(p, e), \quad p \in U_i, \quad e = id \in G.$$

Given a connection $\omega$ in the bundle $E$, define local connections on $M$ using
the pullback of the section maps. Thus, over each pair of overlapping charts
$U_i \cap U_j \neq \emptyset$, we define,

$$\omega_i = s_i^*\omega, \quad \omega_j = s_j^*\omega.$$

Since the transition functions map

$$\varphi_{ij} : U_i \cap U_j \to G,$$

we can pull-back forms. In particular, let $\theta$ be the left invariant Maurer-Cartan
form on $G$. We define a $\mathfrak{g}$-valued form on $U_i \cap U_j$ by

$$\theta_{ij} = \varphi_{ij}^*\theta, \quad \varphi_{ij}^* : T^*G \to T^*(U_i \cap U_j).$$

For applications to gauge theory, the following result is very important,


9.3.4 Theorem On $U_i \cap U_j \neq \emptyset$, the local forms $\omega_i$ and $\theta_{ij}$ on $M$ satisfy the
condition

$$\omega_j = \left(Ad_{\varphi_{ij}^{-1}}\right)\omega_i + \theta_{ij}. \quad (9.11)$$


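Proof Given a point $p \in U_i \cap U_j$, let $X_p \in T_p(U_i \cap U_j)$ be a tangent vector
to a curve $x(t)$, with $x(0) = p$ and $X = x'(0)$. Then the transition equation for
the sections 9.9 reads

$$s_j(x(t)) = s_i(x(t))\, \varphi_{ij}(x(t)).$$

The push-forward

$$s_{j*}(X_p) : T_p(U_i \cap U_j) \to T_{s_j(p)}E,$$

is the image of $(s_{i*}(X), \varphi_{ij*}(X))$ under the differential of the action map

$$T_{s_i(p)}E \oplus T_{\varphi_{ij}(p)}G \to T_{s_j(p)}E.$$

More specifically, taking the derivative $d/dt$ and evaluating at $t = 0$ as done
with the product rule formula 6.2, we get

$$\frac{d}{dt}\, s_j(x(t))\Big|_{t=0} = \frac{d}{dt}\left[s_i(x(t))\, \varphi_{ij}(x(t))\right]_{t=0},$$

$$= \frac{d}{dt}\left[s_i(x(t))\, \varphi_{ij}(x(0))\right]_{t=0} + \frac{d}{dt}\left[s_i(x(0))\, \varphi_{ij}(x(t))\right]_{t=0},$$

$$s_{j*}(X) = R_{\varphi_{ij}(p)*}\, s_{i*}(X) + s_{i*}(p)\, \varphi_{ij*}(X).$$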




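We now apply $\omega$ to both sides, remembering the general definition of the pullback
$s^*\omega(X) = \omega(s_*X)$. We get

$$\omega(s_{j*}(X)) = \omega(R_{\varphi_{ij}(p)*}\, s_{i*}(X)) + \omega(s_{i*}(p)\, \varphi_{ij*}(X)),$$
$$s_j^*\omega(X) = R^*_{\varphi_{ij}(p)}\, \omega(s_{i*}(X)) + \omega(s_{i*}(p)\, \varphi_{ij*}(X)),$$
$$\omega_j(X) = R^*_{\varphi_{ij}(p)}\, \omega_i(X) + \omega(s_{i*}(p)\, \varphi_{ij*}(X)),$$
$$\omega_j(X) = \left(Ad_{\varphi_{ij}^{-1}(p)}\right)\omega_i(X) + \omega(s_{i*}(p)\, \varphi_{ij*}(X)),$$

where in the first term on the right, we have used the condition 9.10 for an
Ehresmann connection. The second term on the right is a bit trickier. We see
from the diagram below

$$\begin{array}{ccc} T_p(U_i \cap U_j) & \xrightarrow{\;\varphi_{ij*}\;} & T_{\varphi_{ij}(p)}G \\ \downarrow & & \downarrow \\ U_i \cap U_j & \xrightarrow{\;\varphi_{ij}\;} & G \end{array}$$

that $\varphi_{ij}(x(t))$ is a curve in $G$, whose differential map sends the tangent vector
$X$ at $p$ to a tangent vector in $G$ at $\varphi_{ij}(p)$,

$$X_p \mapsto \varphi_{ij*}(X)\big|_{\varphi_{ij}(p)}.$$

On the other hand, one can think of $s_i(p)$ as a map from $G$ to $E$ given by

$$s_i(p) : g \mapsto s_i(p)\, g.$$

Thus $s_{i*}(p)\, \varphi_{ij*}(X)$ is the push-forward of $\varphi_{ij*}(X)$ to $TE$ by the Jacobian
map $s_{i*}(p) : TG \to TE$. If $Y \in \mathfrak{g}$ is the left-invariant vector field (*) in $G$ such that
$Y = \varphi_{ij*}(X)$ at $\varphi_{ij}(p)$, then, we have

$$\theta(Y) = \theta(\varphi_{ij*}X) = Y.$$

The image of $Y$ under $s_{i*}(p)$ corresponds to a fundamental vector $\sigma(Y)$, therefore,
by the definition of an Ehresmann connection

$$\omega(s_{i*}(p)\, \varphi_{ij*}(X)) = \omega(\sigma(Y)),$$
$$= Y,$$
$$= \theta(\varphi_{ij*}(X)),$$
$$= \theta_{ij}(X).$$

(*) Specifically, $Y_e = L_{\varphi_{ij}^{-1}(p)*}\left(\varphi_{ij*}(X)\big|_{\varphi_{ij}(p)}\right)$. The notation is a bit cluttered but the concept
is rather simple. The vector $Y$ generates a one parameter subgroup of $G$ whose tangent vector
at $\varphi_{ij}(p)$ coincides with $\varphi_{ij*}(X)$. The integral curve of $Y$ induces a fundamental vector field
$\sigma(Y)$ on the fiber $F_p$.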




The converse of the theorem is obtained by reversing the argument above. This 
concludes the proof. 

If $G$ is a matrix group, and on the overlap of the charts $U_i \cap U_j$ we denote
the transition functions $\varphi_{ij}(p)$ by a matrix transformation $B$, then equation
9.11 reads

$$\omega_j = B^{-1}\omega_i B + B^{-1}dB,$$


which we immediately recognize as the transformation law for an affine con- 
nection in the manifold, (or a local gauge transformation in the language of 
physics.) Since this holds for any pair of overlapping patches, we see that the 
connection in the bundle gives rise to a family of connections on M which piece 
together as we desire on any overlap. 


9.3.2 Horizontal Lift 


Given a principal fiber bundle $\xi = \{E, \pi, M, G\}$, and $b \in E$, we define the
horizontal lift $X^{\#}_b$ of a vector field $X \in \mathfrak{X}(M)$ to be the horizontal vector at
$b$ that projects to $X$; that is

a) $\omega(X^{\#}_b) = 0$,

b) $\pi_*(X^{\#}_b) = X_{\pi(b)}$.

The horizontal lift is right translation invariant meaning

c) $R_{g*} X^{\#}_b = X^{\#}_{bg}$, for all $b \in E$ and $g \in G$.

We have the following,


9.3.5 Proposition Let $X^{\#}$ and $Y^{\#}$ be horizontal lifts of $X$ and $Y$ respectively,
and let $f \in \mathcal{F}(M)$. Denote by $f^{\#}$ the composition $f^{\#} : E \xrightarrow{\;\pi\;} M \xrightarrow{\;f\;} \mathbb{R}$. Then

a) $X^{\#} + Y^{\#} = (X + Y)^{\#}$,

b) $f^{\#} X^{\#} = (fX)^{\#}$,

c) $h[X^{\#}, Y^{\#}] = [X, Y]^{\#}$.

Proof Only part (c) requires a little thinking. The proof rests on the fact that
the push-forward of the Lie bracket is equal to the bracket of the push-forwards,
as shown in equation 7.25. Since $\pi_*$ annihilates vertical vectors, we have

$$\pi_*(h[X^{\#}, Y^{\#}]) = \pi_*[X^{\#}, Y^{\#}] = [\pi_* X^{\#}, \pi_* Y^{\#}] = [X, Y],$$

so $h[X^{\#}, Y^{\#}]$ is a horizontal vector field that projects to $[X, Y]$.

To give a better illustration of the horizontal lift of vector fields, consider the
bundle of frames $E = B(M)$.

If $\nabla$ is a connection on $M$, using the notion of parallelism defined by 6.64,
we can parallel transport the tangent space along curves in $M$. Let $\{x^1, \dots, x^n\}$
be coordinates in a coordinate chart about a point $p \in M$ and let $\{\partial_i = \frac{\partial}{\partial x^i}\}$
be the standard basis for $T_pM$. The horizontal lifts $\{(\partial_i)^{\#}\}$ then constitute
a basis for the distribution $b \mapsto H_b$. Let $b = (p, e_1, \dots, e_n) \in B(M)$, with




$e_j = A^i{}_j \partial_i$, $A \in Gl(n, \mathbb{R})$, and $\alpha(t)$ be a curve in $M$ with $\alpha(0) = p$. By parallel
translation $\{e_i(t) = e_i|_{\alpha(t)}\}$ of the frame $\{e_i\}|_p$, we define a curve

$$\alpha^{\#}(t) = (\alpha(t), e_1(t), \dots, e_n(t)).$$

Since $(\pi \circ \alpha^{\#})(t) = \alpha(t)$, we get a horizontal lift $\alpha^{\#}(t)$ of $\alpha(t)$. Thus a connection
on $M$ allows us to get unique horizontal lifts of curves in $M$. A connection on
the frame bundle, together with the notion of horizontal lifts of vector fields,
provides a way to naturally lift curves on $M$ to the bundle. Let $\alpha(t)$ be a curve
in $M$ with tangent vector field $T$ in a neighborhood $U$ where $\alpha$ is injective.
Lift $T$ horizontally to $T^{\#}$, and let $\alpha^{\#}$ be the integral curve in $B(M)$. If $T^{\#}_b$ is
horizontal, that is $T^{\#}_b \in H_b$, then by the properties of the bundle connection,
$R_{g*} T^{\#}_b = T^{\#}_{bg}$ is also horizontal. Thus, the curve $\alpha^{\#}$ is horizontal independent
of the point $b$ at $t = 0$. The horizontal lift defines a parallel transport on the
manifold. The idea extends to any principal fiber bundle. For a more careful
treatment, see for example, Kobayashi and Nomizu [18] or Spivak [34].


9.3.3 Curvature Form 


Returning to the concept of a connection on
a principal fiber bundle $\xi = \{E, \pi, M, G\}$,
with $b \in E$, we introduce the
following.


9.3.6 Definition A form $\phi$ of degree $k$ in $E$
is called a tensorial form of adjoint type if

$$R_g^*\phi = Ad_{g^{-1}} \cdot \phi, \quad \text{for all } g \in G,$$

and $\phi(X_1, \dots, X_k) = 0$ if at least one of the
tangent vectors $X_i$ in $E$ is vertical. The $k + 1$
form $D\phi = (d\phi)h$, that is

$$D\phi(X_1, \dots, X_{k+1}) = d\phi(hX_1, \dots, hX_{k+1}),$$

is a tensorial form called the exterior covariant derivative of $\phi$.


Now we come to the main result of this section. First, we will need,


9.3.7 Lemma If $Y_1$ is horizontal and $Y_2 = \sigma(X_2)$ is a fundamental vertical
vector field generated by $X_2$, then $[Y_1, Y_2]$ is horizontal.

Proof Let $\gamma_t = e^{tX_2}$ be the one-parameter subgroup in $G$ generating $Y_2$ by
right translation $R_{\gamma_t}$. Then

$$[Y_1, Y_2] = -[Y_2, Y_1] = -\mathcal{L}_{Y_2} Y_1,$$
$$= -\lim_{t \to 0} \frac{1}{t}\left[Y_1 - R_{\gamma_t *} Y_1\right].$$

If $Y_1$ is horizontal, so is $R_{\gamma_t *} Y_1$, so the left hand side $[Y_1, Y_2]$ is also horizontal.


9.3.8 Theorem Let $\omega$ be a connection on $E$. Then the curvature form $\Omega$
defined as $\Omega = D\omega$ satisfies the structure equation

$$D\omega(Y_1, Y_2) = d\omega(Y_1, Y_2) + [\omega(Y_1), \omega(Y_2)]. \quad (9.12)$$

Proof Every vector $Y \in T_bE$ can be split into the sum of a vertical and a
horizontal vector. Both sides of the structure equation are skew-symmetric and
bilinear, so it suffices to treat the following three cases.

Case 1. $Y_1$ and $Y_2$ are horizontal. Then $\omega(Y_1) = \omega(Y_2) = 0$, and $hY_1 = Y_1$,
$hY_2 = Y_2$. Inserting into equation 9.12, we get

$$D\omega(Y_1, Y_2) = d\omega(Y_1, Y_2) = d\omega(hY_1, hY_2),$$

which is precisely the definition of $D\omega$.

Case 2. $Y_1$ and $Y_2$ are vertical. By definition, $\Omega(Y_1, Y_2) = 0$, thus we have to
prove that the right hand side is also 0. Since $Y_1, Y_2$ are fundamental vector
fields, there exist vectors $X_1, X_2 \in \mathfrak{g}$ such that $Y_1 = \sigma(X_1)$ and $Y_2 = \sigma(X_2)$. So
$\omega(Y_1) = X_1$ and $\omega(Y_2) = X_2$ are constant. From the definition of the differential
of a one-form 6.28, we have,

$$d\omega(Y_1, Y_2) = Y_1(\omega(Y_2)) - Y_2(\omega(Y_1)) - \omega([Y_1, Y_2]),$$
$$= -\omega([Y_1, Y_2]) = -\omega([\sigma(X_1), \sigma(X_2)]),$$
$$= -\omega(\sigma[X_1, X_2]), \quad \text{by theorem 7.3.4,}$$
$$= -[X_1, X_2] = -[\omega(Y_1), \omega(Y_2)].$$

Thus,

$$d\omega(Y_1, Y_2) + [\omega(Y_1), \omega(Y_2)] = 0.$$

Case 3. $Y_1$ is horizontal and $Y_2$ is vertical. By definition $\Omega(Y_1, Y_2) = 0$, so we
have to show that the right hand side is also 0. Extend $Y_1$ to a horizontal vector
field, and let $X_2 \in \mathfrak{g}$ be the vector generating $Y_2 = \sigma(X_2)$. Then as in case
2, $\omega(Y_2) = \omega(\sigma(X_2))$ is constant, so $Y_1(\omega(Y_2)) = 0$ and $[\omega(Y_1), \omega(Y_2)] = 0$. It
remains to show that $d\omega(Y_1, Y_2) = 0$. We have,

$$d\omega(Y_1, Y_2) = Y_1(\omega(Y_2)) - Y_2(\omega(Y_1)) - \omega([Y_1, Y_2]),$$
$$= -\omega([Y_1, Y_2]),$$
$$= 0, \quad \text{by lemma 9.3.7.}$$


9.4 Gauge Fields 


As described in the historical notes earlier, physicists and mathematicians 
developed the notion of a connection on a principal fiber bundle independently, 
and it wasn’t until the 1970’s that they realized that they were talking about 
the same objects. Here is a short lexicon of the corresponding terms used in the
two disciplines.




Mathematics            | Physics
Principal fiber bundle | Gauge space
G structure group      | Gauge group (such as SU(2))
Connection form        | Gauge potential (such as A)
Curvature form         | Field strength (such as E and B)
Local trivialization   | Choice of gauge
Transition function    | Change of gauge


To get to the physical significance of the principal fiber bundle formalism, let
$\omega$ be a connection on the PFB, with curvature $\Omega = D\omega$. Assuming the structure
group has dimension $k$, let $\{e_1, \dots, e_k\}$ be a basis for the Lie algebra $\mathfrak{g}$. Then
we can write the components of the connection as $\omega = \omega^\alpha e_\alpha$, and the structure
equation 9.12 reads,

$$\Omega^\alpha = d\omega^\alpha + \frac{1}{2}\, C^\alpha{}_{\beta\gamma}\, \omega^\beta \wedge \omega^\gamma.$$

The $\alpha$'s in the forms $\Omega^\alpha$ and $\omega^\alpha$ are Lie algebra indices which reflect the fact that
the forms are Lie algebra valued. The reader will of course note the similarity
to the Maurer-Cartan equations 7.68. If we pick a local trivialization $\{U, \phi\}$
with local section $s : U \to E$, and label the local forms

$$A = s^*\omega, \quad F = s^*\Omega,$$

we get the expression

$$F^\alpha = dA^\alpha + \frac{1}{2}\, C^\alpha{}_{\beta\gamma}\, A^\beta \wedge A^\gamma.$$

Better yet, if the local coordinates of the manifold are denoted by $\{x^\mu\}$, we can
write the equation above to include the tensor indices. Writing
$F^\alpha = \frac{1}{2} F^\alpha_{\mu\nu}\, dx^\mu \wedge dx^\nu$, we get

$$F^\alpha_{\mu\nu} = \frac{\partial A^\alpha_\nu}{\partial x^\mu} - \frac{\partial A^\alpha_\mu}{\partial x^\nu} + C^\alpha{}_{\beta\gamma}\, A^\beta_\mu A^\gamma_\nu, \quad (9.13)$$

which is the familiar form encountered in the physics of Yang-Mills fields. On
the non-empty overlap of two coordinate charts, $\omega$ and $\Omega$ transform as a connection
and a tensorial form should.
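
The role of the structure constants in the component formula can be seen in a small numerical sketch. It assumes the normalization $F^\alpha = \frac{1}{2} F^\alpha_{\mu\nu}\, dx^\mu \wedge dx^\nu$, in which the quadratic term enters the components with unit coefficient, and it takes the structure constants of $\mathfrak{su}(2)$ to be the Levi-Civita symbol. The function names, the two-dimensional base, and the constant sample potential are hypothetical choices made only for illustration.

```python
def epsilon(a, b, c):
    """Structure constants of su(2): the Levi-Civita symbol on indices 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2

def field_strength(A, dA, C):
    """Components F[a][mu][nu] at a single point.

    A[a][mu]      : value of the potential component A^a_mu
    dA[mu][a][nu] : value of the partial derivative d_mu A^a_nu
    C(a, b, c)    : structure constants C^a_{bc}
    """
    k, n = len(A), len(A[0])
    return [[[dA[mu][a][nu] - dA[nu][a][mu]
              + sum(C(a, b, c) * A[b][mu] * A[c][nu]
                    for b in range(k) for c in range(k))
              for nu in range(n)] for mu in range(n)] for a in range(k)]

# A constant potential on a two-dimensional base: all derivatives vanish.
A = [[1, 0], [0, 1], [0, 0]]
dA = [[[0, 0] for _ in range(3)] for _ in range(2)]

F_su2 = field_strength(A, dA, epsilon)
F_abelian = field_strength(A, dA, lambda a, b, c: 0)
print(F_su2[2], F_abelian[2])
```

For the constant potential the derivative terms drop out, so the Abelian field strength vanishes identically while the non-Abelian one does not; the whole contribution comes from the structure constants.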


9.4.1 Electrodynamics 


We take a closer look at the special case of electrodynamics. In the classical
theory of electromagnetism, we find the simplest example of a gauge theory. If
$F$ is the Maxwell 2-form, then as in 2.115, we have $dF = 0$. Therefore, by the
Poincaré lemma, in a simply connected region, there exists a one-form $A$ such
that $F = dA$. In tensor components, this reads

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu.$$

The one-form $A = A_\mu\, dx^\mu$ is not unique because the transformation

$$A \mapsto A' = A + d\chi$$

leaves the field strength $F$ invariant, since $d(d\chi) = 0$. In vector notation, $A_\mu = (\phi, \mathbf{A})$, the gauge
freedom reads

$$\mathbf{A}' = \mathbf{A} + \nabla\chi,$$
$$\phi' = \phi - \frac{\partial \chi}{\partial t},$$

and the corresponding fields $\mathbf{E}$ and $\mathbf{B}$ remain invariant. Thus, one can solve
the dynamic equations working with $A$, knowing that the observables are gauge
independent. The gauge freedom of the electromagnetic field is an asset, rather
than a liability, for it allows one to adjust the potentials to have properties that
do not affect the fields. Of these, perhaps the most useful is the Lorentz gauge
$\partial_\mu A^\mu = 0$ that leads to the wave equation

$$\Box A^\mu = J^\mu$$

for the potential, and the polarization states of electromagnetic waves. For time
dependent fields, the solutions are called the retarded potentials [17]

$$\phi(t, \mathbf{r}) = \frac{1}{4\pi} \int \frac{\rho(t_r, \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\, d^3r',$$

$$\mathbf{A}(t, \mathbf{r}) = \frac{1}{4\pi} \int \frac{\mathbf{J}(t_r, \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\, d^3r',$$

where,

$$t_r = t - \frac{|\mathbf{r} - \mathbf{r}'|}{c}.$$
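
The gauge invariance of the field strength discussed above can be checked numerically. The sketch below approximates $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ by central differences and compares the result for a potential $A$ and for $A' = A + d\chi$. The potential, the gauge function, the evaluation point and the step size are all hypothetical choices made for illustration; this is a numerical check under those assumptions, not a proof.

```python
import math

H = 1e-3  # step for central differences (an arbitrary choice for this sketch)

def partial(f, mu, x):
    """Central-difference approximation of the partial derivative of f along x[mu]."""
    xp, xm = list(x), list(x)
    xp[mu] += H
    xm[mu] -= H
    return (f(xp) - f(xm)) / (2 * H)

def potential(x):
    """A hypothetical potential A_mu(t, x, y, z), chosen only for illustration."""
    t, px, py, pz = x
    return [px * py, t * pz, math.sin(px), 0.0]

def chi(x):
    """A hypothetical gauge function."""
    t, px, py, pz = x
    return t * px * py + math.cos(pz)

def transformed_potential(x):
    """A'_mu = A_mu + d_mu chi."""
    a = potential(x)
    return [a[mu] + partial(chi, mu, x) for mu in range(4)]

def field_strength(A, x):
    """F_{mu nu} = d_mu A_nu - d_nu A_mu at the point x."""
    return [[partial(lambda y: A(y)[nu], mu, x) - partial(lambda y: A(y)[mu], nu, x)
             for nu in range(4)] for mu in range(4)]

point = [0.3, -1.2, 0.7, 2.0]
F = field_strength(potential, point)
F_gauged = field_strength(transformed_potential, point)
max_change = max(abs(F[m][n] - F_gauged[m][n]) for m in range(4) for n in range(4))
print(max_change)
```

The difference between the two field strengths is the antisymmetrized second derivative of $\chi$, which vanishes because mixed partial derivatives commute; numerically only rounding error remains.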


The gauge group is probably more evident in quantum electrodynamics (QED). 
From equation 2.123, the classical electromagnetic Lagrangian is 


1 V 
M GFF + JHA. 


EM 


As shown there, the Euler-Lagrange equations lead to Maxwell equations. The 
Dirac equation 8.81 (with A = c = 1) for electron/positron fields 


(iy"d, — m)U =0 


is generated by the Dirac Lagrangian for fermion fields of spin Z and mass m, 


Ly = Yhig" ð, — M)Y. (9.14) 


The Dirac Lagrangian is invariant under the phase transformations 


pla) = y (2) = iela), B(x) P (a) = e(a). 


This is called an internal global symmetry, where global refers to the symmetry 
being independent of the position, and internal to the symmetry not changing 
the location. Since e’®* is unimodular, the gauge group is U (1). On the other 


342 CHAPTER 9. BUNDLES AND APPLICATIONS 


hand, if we want to impose a local symmetry by letting A(x) depend on x, the 
Dirac Lagrangian transforms as 


L, = L = lig" (3, — ied,,d) — m)y. 


To make the new Lagrangian invariant requires the introduction of a covariant
derivative operator
$$\nabla_\mu = \partial_\mu + ieA_\mu, \qquad (9.15)$$


with a corresponding modification to the Lagrangian


$$\mathcal{L}_\psi = \bar\psi(i\gamma^\mu\nabla_\mu - m)\psi. \qquad (9.16)$$
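
To see why the covariant derivative restores invariance, the following short computation may help. It is a sketch under an assumed sign convention, $\psi \mapsto e^{-ie\lambda(x)}\psi$ together with $A_\mu \mapsto A_\mu + \partial_\mu\lambda$; with the opposite sign for the phase, the shift in $A_\mu$ changes sign as well.

```latex
\begin{aligned}
\nabla'_\mu \psi'
  &= (\partial_\mu + ieA_\mu + ie\,\partial_\mu\lambda)\,\big(e^{-ie\lambda}\psi\big) \\
  &= e^{-ie\lambda}\,\big(\partial_\mu - ie\,\partial_\mu\lambda + ieA_\mu + ie\,\partial_\mu\lambda\big)\,\psi \\
  &= e^{-ie\lambda}\,\nabla_\mu\psi ,
\end{aligned}
\qquad\Longrightarrow\qquad
\bar\psi'\,(i\gamma^\mu\nabla'_\mu - m)\,\psi' = \bar\psi\,(i\gamma^\mu\nabla_\mu - m)\,\psi .
```

The unwanted term $-ie\,\partial_\mu\lambda$ produced by differentiating the phase is cancelled exactly by the shift of the potential, which is the defining property of a connection.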


Finally, if we add the electromagnetic Lagrangian, we get the full QED Lagrangian
$$\mathcal{L}_{QED} = \bar\psi(i\gamma^\mu\nabla_\mu - m)\psi - \tfrac{1}{4}F_{\mu\nu}F^{\mu\nu}. \qquad (9.17)$$


Thus, local invariance leads to a coupling with the electromagnetic potential $A^\mu$,
which can now be interpreted as a connection on a $U(1)$ bundle, thus
providing a mechanism for covariant differentiation along the sections of the bundle.
The QED Lagrangian can also be written to better elucidate the coupling to E&M, as

$$\mathcal{L}_{QED} = \bar\psi(i\gamma^\mu\partial_\mu - m)\psi - \tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} - A_\mu J^\mu, \qquad (9.18)$$
where $J^\mu = e\bar\psi\gamma^\mu\psi$. If the structure group is replaced by $G = U(1) \times SU(2)$
we get the Weinberg-Salam electroweak model; if the group is replaced by $G = SU(3)$,
we get Quantum Chromodynamics (QCD). In either case the Lagrangian
requires only a modification for the curvature form $F$ to have an extra index to
indicate that it is Lie algebra valued. Thus, the QCD Lagrangian is


$$\mathcal{L}_{QCD} = \bar\psi(i\gamma^\mu\nabla_\mu - m)\psi - \tfrac{1}{4}F^{a}_{\mu\nu}F_{a}^{\mu\nu}. \qquad (9.19)$$
As before, the field strength $F$ is the curvature of the connection
$$F = DA = dA + ie\,A\wedge A,$$


but compared to electromagnetism, the wedge/bracket term makes this a non-Abelian
gauge theory.


9.4.2 Dirac Monopole 


No magnetic monopoles have been observed, but if they existed, we would like the fields
to satisfy an extended Maxwell equation


$$\nabla\cdot\mathbf{B} = 4\pi\rho_m,$$
where $\rho_m = g\,\delta(\mathbf{r})$ is the point density of magnetic charge. Then, the solution
is a $1/r^2$ law
$$\mathbf{B} = g\,\frac{\mathbf{r}}{r^3}.$$




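Let $F$ be the electromagnetic 2-form for a pure magnetic field. By Stokes'
theorem the flux over a closed surface $R$ bounding a volume $V$ is


$$\Phi = \int_R F = \int_R \mathbf{B}\cdot d\mathbf{S} = \int_V \nabla\cdot\mathbf{B}\; dV,$$


where

$$F = \mathbf{B}\cdot d\mathbf{S} = \frac{g}{r^3}\,(x\, dy\wedge dz + y\, dz\wedge dx + z\, dx\wedge dy).$$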


If we constrain $F$ to a 2-sphere centered at the origin and convert to spherical
coordinates, the form $F$ simplifies to


$$F = g\sin\theta\; d\theta\wedge d\phi.$$
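
As a numerical sanity check of this form, the sketch below integrates $g\sin\theta\, d\theta\wedge d\phi$ over a polar cap with a plain midpoint rule and compares the result with the closed-form value $2\pi g(1-\cos\theta)$. Over the full sphere this gives $4\pi g$, as the extended Maxwell equation requires. The function name `monopole_flux` and the value $g = 1/2$ are illustrative choices, not notation from the text.

```python
import math


def monopole_flux(g, theta_max, n_steps=20000):
    """Flux of F = g sin(theta) dtheta ^ dphi through the cap 0 <= theta <= theta_max.

    The phi integral contributes a factor 2*pi by symmetry; the theta
    integral is approximated with the midpoint rule.
    """
    h = theta_max / n_steps
    theta_integral = sum(math.sin((k + 0.5) * h) for k in range(n_steps)) * h
    return 2.0 * math.pi * g * theta_integral


g = 0.5  # illustrative magnetic charge
full_sphere = monopole_flux(g, math.pi)
cap = monopole_flux(g, math.pi / 3)

print(full_sphere, 4 * math.pi * g)
print(cap, 2 * math.pi * g * (1 - math.cos(math.pi / 3)))
```

Each printed pair should agree to many decimal places, since the midpoint rule converges quadratically for this smooth integrand.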


There is of course no globally defined potential for $F$, because that would imply
that $dF = 0$, and that is no longer true at the location of the monopole. Still, we seek local forms $A$ with
$dA = F$. Since up to the constant factor $g$, $F$ is the curvature form for a
sphere, as shown in example 4.5.9, the natural candidate is given by the components
of the Cartan connection form

$$A_{trial} = -g\cos\theta\; d\phi.$$
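
That the trial potential reproduces the monopole field away from the $z$-axis is a one-line check, written out here for convenience. The Cartesian expressions for $\cos\theta$ and $d\phi$ are the standard spherical-coordinate relations.

```latex
\begin{aligned}
dA_{trial} &= d(-g\cos\theta)\wedge d\phi = g\sin\theta\; d\theta\wedge d\phi = F, \\
\cos\theta &= \frac{z}{r}, \qquad
d\phi = d\!\left(\arctan\frac{y}{x}\right) = \frac{x\,dy - y\,dx}{x^2 + y^2}.
\end{aligned}
```

The second line shows where the trouble comes from: $d\phi$ is undefined where $x^2 + y^2 = 0$, and the coefficient $\cos\theta = \pm 1$ does not vanish there.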


Unfortunately the potential has singularities that might not be apparent in
spherical coordinates, but become evident in Cartesian coordinates


$$A = -g\,\frac{z}{r}\;\frac{x\,dy - y\,dx}{x^2 + y^2}.$$


This is the same problematic form 2.82 which we noted as a standard
counterexample in the discussion of the Poincaré lemma. The form is singular along
$x^2 + y^2 = 0$, that is, along the $z$-axis. In Dirac's original construction of the monopole solution, he
allowed for the singularities by essentially cutting off the negative $z$-axis, a set usually
called a Dirac string. The modern approach circumvents the singularities by
constructing a connection on a Hopf bundle. Let $(z^1, z^2)$ be coordinates on $\mathbb{C}^2$,
with
$$z^1 = x^1 + ix^2, \qquad z^2 = x^3 + ix^4,$$
and define $\mathbb{CP}^1$ as in section 8.1.4 by the quotient $\mathbb{C}^2/\sim$ with the equivalence
relation
$$(z^1, z^2) \sim (\lambda z^1, \lambda z^2), \qquad \lambda \in \mathbb{C},\ \lambda \neq 0,$$


and constraining to the three-sphere $S^3: |z^1|^2 + |z^2|^2 = 1$. Following the
convention in equation 5.60, let

$$\zeta_1 = \frac{x + iy}{1 - z}$$


be the complex number associated with the point $p(x, y, z) \in S^2$ under the
stereographic projection from the north pole. This gives a coordinate chart for $S^2$,
but the chart misses the north pole. To cover the sphere, we create another
chart by a stereographic projection from the south pole. By the same process



of ratio and proportions for similar right triangles, we find that the complex
number $\zeta_2$ that represents the same point on the sphere is given by


$$\zeta_2 = \frac{x - iy}{1 + z}.$$


The minus sign in the $y$ coordinate is needed to preserve the orientation of the
coordinate axes. The chart based on the south pole maps the north pole to $0$
and the south pole to $\infty$, so we should expect the change of variables to behave
like the conformal inversion $f(z) = 1/z$ in the complex plane. Not surprisingly
this is exactly what we get,


$$\frac{1}{\zeta_1} = \frac{1 - z}{x + iy} = \frac{x - iy}{1 + z} = \zeta_2.$$
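
The inversion relation between the two stereographic charts is easy to confirm numerically. The sketch below evaluates both charts at a sample point of the unit sphere and checks that $\zeta_1\zeta_2 = 1$ and that $\zeta_1 = \cot(\theta/2)\,e^{i\phi}$. The helper names `zeta_north`, `zeta_south` and `sphere_point` are hypothetical, chosen only for this illustration.

```python
import cmath
import math


def sphere_point(theta, phi):
    """Cartesian coordinates of the unit-sphere point with polar angle theta, azimuth phi."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


def zeta_north(x, y, z):
    """Stereographic coordinate from the north pole: (x + iy) / (1 - z)."""
    return complex(x, y) / (1.0 - z)


def zeta_south(x, y, z):
    """Stereographic coordinate from the south pole: (x - iy) / (1 + z)."""
    return complex(x, -y) / (1.0 + z)


theta, phi = 1.1, 2.3          # arbitrary point away from both poles
p = sphere_point(theta, phi)
z1, z2 = zeta_north(*p), zeta_south(*p)

print(z1 * z2)                                         # close to 1
print(z1, cmath.rect(1.0 / math.tan(theta / 2), phi))  # both equal cot(theta/2) e^{i phi}
```

The identity holds because $(1-z)(1+z) = x^2 + y^2$ on the unit sphere, so the product of the two coordinates collapses to one.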


Thus, we can form a cover of $S^2$ by two charts $\{U_1, \zeta_1\}$ and $\{U_2, \zeta_2\}$ that
overlap on an infinitesimal neighborhood of the equator,


$$U_1 = \{(\theta,\phi) : \tfrac{\pi}{2} - \epsilon < \theta \le \pi\},$$
$$U_2 = \{(\theta,\phi) : 0 \le \theta < \tfrac{\pi}{2} + \epsilon\},$$


where in the overlap, $\zeta_2 = 1/\zeta_1$. If one sets $\zeta_1 = z^1/z^2$, the Hopf fibration
$\pi: S^3 \to S^2$ is obtained by associating a point $(z^1, z^2)$ on $S^3$ with the inverse
image of $z^1/z^2$ under the stereographic projection of $S^2$ to $\mathbb{C}$. It is unavoidable
to have a minor index inconsistency in the chart labels in the sense that $\zeta_1$ is
associated with the projection from the north pole, but the coordinate chart $U_1$
is about the south pole. The bundle charts of $S^3 \to S^2$ are given by


$$\phi_1 : \pi^{-1}(U_1) \to U_1 \times U(1), \qquad \phi_1(z^1, z^2) = (\zeta_1, f_1),$$
$$\phi_2 : \pi^{-1}(U_2) \to U_2 \times U(1), \qquad \phi_2(z^1, z^2) = (\zeta_2, f_2).$$


We can now perform the flux integrals on the two overlapping hemispheres from
the poles to an angle $\theta$,

$$\Phi_1 = \iint F = -2\pi g(1 + \cos\theta),$$


$$\Phi_2 = \iint F = 2\pi g(1 - \cos\theta).$$


Using Stokes' theorem, $\oint \mathbf{A}\cdot d\mathbf{r} = \iint F$, and using the symmetry around a
parallel circle $C$ on $S^2$ at fixed angle $\theta$, we can set $\mathbf{A} = A_\phi\,\mathbf{e}_\phi$. The line integrals




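yield the components of the two vector field potentials with corresponding local
connections


$$(A_1)_\phi = -\frac{g(1 + \cos\theta)}{r\sin\theta}, \quad\text{or}\quad A_1 = -g(1 + \cos\theta)\, d\phi,$$
$$(A_2)_\phi = \frac{g(1 - \cos\theta)}{r\sin\theta}, \quad\text{or}\quad A_2 = g(1 - \cos\theta)\, d\phi. \qquad (9.20)$$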


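As before, the singularity structure of the connections is more evident in Cartesian
coordinates. Multiplying and dividing equations 9.20 by $r$, we get


$$\begin{aligned}
A_1 &= -\frac{g}{r}(r + r\cos\theta)\,d\phi
     = -\frac{g}{r}\,(r + z)\,\frac{x\,dy - y\,dx}{x^2 + y^2}
     = -\frac{g}{r(r - z)}\,(x\,dy - y\,dx), \\
A_2 &= \frac{g}{r}(r - r\cos\theta)\,d\phi
     = \frac{g}{r}\,(r - z)\,\frac{x\,dy - y\,dx}{x^2 + y^2}
     = \frac{g}{r(r + z)}\,(x\,dy - y\,dx),
\end{aligned}$$

where we used $x^2 + y^2 = r^2 - z^2 = (r - z)(r + z)$. Thus $A_1$ is singular only along the
positive $z$-axis, which lies outside $U_1$, and $A_2$ only along the negative $z$-axis, which
lies outside $U_2$.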

Now we construct a twisted $U(1)$ principal fiber bundle with local trivializations
$\pi^{-1}(U_1)$ and $\pi^{-1}(U_2)$ with transition functions $\phi_{12}$ on the overlap given by


$$\phi_{12} = e^{in\phi}, \qquad n \in \mathbb{Z}.$$


Here, the transition functions are unimodular, so they can be written as $e^{in\phi} \in U(1)$.
The number $n$ is required to be an integer to ensure smooth gluing on the
overlap. If $n = 0$ we get a trivial bundle $S^2 \times S^1$. If $n = 1$ we get the standard
Hopf bundle. If $s_1$ and $s_2$ are sections over $U_1$ and $U_2$ respectively, we get an
Ehresmann connection on the bundle with


$$A_1 = s_1^*\omega, \quad\text{and}\quad A_2 = s_2^*\omega.$$
On the overlap, the connection transformation
$$A_2 = \phi_{12}^{-1} A_1 \phi_{12} + \phi_{12}^{-1}\, d\phi_{12},$$


just reads
$$A_2 - A_1 = 2g\, d\phi.$$
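
The two sides of this overlap relation can be made explicit. The sketch below uses the potentials in equation 9.20 and the transition function $\phi_{12} = e^{in\phi}$; it assumes that the real form $A$ is identified with the $u(1)$-valued connection up to the customary factor of $i$, which the text leaves implicit.

```latex
\begin{aligned}
A_2 - A_1 &= g(1 - \cos\theta)\,d\phi + g(1 + \cos\theta)\,d\phi = 2g\,d\phi, \\
\phi_{12}^{-1} A_1 \phi_{12} &= A_1 \quad\text{(the group is Abelian)}, \\
\phi_{12}^{-1}\,d\phi_{12} &= e^{-in\phi}\, d\big(e^{in\phi}\big) = i\,n\,d\phi .
\end{aligned}
```

Comparing the first and third lines, with the factor of $i$ absorbed into the identification, the coefficient $2g$ must match the integer $n$.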


As $\phi$ goes around the equator, we must have


$$\int_0^{2\pi} 2g\, d\phi = 2n\pi,$$


which means that
$$2g = n \in \mathbb{Z}.$$




This is Dirac's quantization condition for magnetic monopoles. The integer $n$
corresponds to the first Chern class $c_1$ of the bundle. If in addition the particle
has an electric charge, the wave function must satisfy Schrödinger's equation


$$\frac{1}{2m}\Big(\mathbf{p} - \frac{e}{c}\mathbf{A}\Big)^2 |\psi\rangle = E\,|\psi\rangle.$$
Under the gauge transformation $\mathbf{A} \to \mathbf{A} + \nabla\Lambda$, given wave functions $|\psi_1\rangle$ and $|\psi_2\rangle$
on $U_1$ and $U_2$, they must transform as


$$|\psi_1\rangle = e^{-ie\Lambda/\hbar c}\,|\psi_2\rangle, \qquad \Lambda = 2g\phi,$$


where we have restored $\hbar$ and $c$ to standard units. For a fixed value of $\theta$, as
$\phi$ goes from $0$ to $2\pi$, the requirement that $|\psi\rangle$ be single-valued
implies that
$$\frac{2eg}{\hbar c} = n.$$
That is, for a singly charged monopole ($n = 1$), the ratio of the magnetic to the
electric charge is given in terms of the fine structure constant
$$\frac{g}{e} = \frac{\hbar c}{2e^2} \approx \frac{137}{2} \approx 69, \qquad (9.21)$$
an amusing result which I first learned from professor Raymond Sachs in a
problem assigned in my upper division course in E&M. The magnetic monopole
is often called a topological charge, for it essentially arises from the classification
of the transition functions, which are basically continuous functions from the
equator $S^1$ to $U(1)$, and hence they correspond to the fundamental homotopy
group $\pi_1(U(1)) = \mathbb{Z}$.
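
For readers who want the number behind equation 9.21, here is a minimal sketch that evaluates $g/e = n/(2\alpha)$ in Gaussian units, where $\alpha = e^2/\hbar c$. The constant `ALPHA` is an approximate value of the fine structure constant, and the function name is an illustrative choice rather than anything defined in the text.

```python
# Approximate fine structure constant, alpha = e^2 / (hbar c) in Gaussian units.
ALPHA = 1.0 / 137.035999


def monopole_to_electric_ratio(n=1, alpha=ALPHA):
    """Ratio g/e implied by the Dirac condition 2 e g / (hbar c) = n."""
    return n / (2.0 * alpha)


print(monopole_to_electric_ratio(1))  # about 68.5, which the text rounds to 69
print(monopole_to_electric_ratio(2))  # doubles with the Chern number n
```

The large value of the ratio reflects the smallness of $\alpha$: a Dirac monopole would couple to the electromagnetic field far more strongly than an electron does.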
It is worthwhile to view the monopole connection via the complex structure
of the base space. Observe that other than a "Lie algebra" factor of $i$, the Dirac
monopole 2-form is given by the Kähler form 5.65
$$\frac{d\zeta\wedge d\bar\zeta}{(1 + \zeta\bar\zeta)^2}.$$


We thus expect to have a complex gauge potential that leads to this
2-form. First, we use Euler angles


$$z_1 = e^{i(\psi+\phi)/2}\cos\tfrac{\theta}{2}, \qquad (9.22)$$


$$z_2 = e^{i(\psi-\phi)/2}\sin\tfrac{\theta}{2}, \qquad (9.23)$$
to write the equation $|z_1|^2 + |z_2|^2 = 1$ of the three-sphere $S^3$. The induced
Riemannian metric on $S^3$ is given by
$$ds^2 = 4(dz_1 d\bar z_1 + dz_2 d\bar z_2)$$
$$\phantom{ds^2} = d\theta^2 + \sin^2\theta\, d\phi^2 + (d\psi + \cos\theta\, d\phi)^2.$$
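
The step from the first line of the metric to the second is a short computation with the Euler angle parametrization 9.22 and 9.23. It is spelled out here as a worked check.

```latex
\begin{aligned}
4\,dz_1 d\bar z_1 &= \sin^2\tfrac{\theta}{2}\,d\theta^2 + \cos^2\tfrac{\theta}{2}\,(d\psi + d\phi)^2, \\
4\,dz_2 d\bar z_2 &= \cos^2\tfrac{\theta}{2}\,d\theta^2 + \sin^2\tfrac{\theta}{2}\,(d\psi - d\phi)^2, \\
ds^2 &= d\theta^2 + d\psi^2 + d\phi^2 + 2\cos\theta\,d\psi\,d\phi \\
     &= d\theta^2 + \sin^2\theta\,d\phi^2 + (d\psi + \cos\theta\,d\phi)^2 .
\end{aligned}
```

The third line uses $\cos^2\tfrac{\theta}{2} - \sin^2\tfrac{\theta}{2} = \cos\theta$, and the last line completes the square in $d\psi$.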
The form
$$\omega = d\psi + \cos\theta\, d\phi$$




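defines a connection on $S^3$ viewed as an $S^1$ bundle over $S^2$. This natural
connection, in complex terms, is induced by the restriction of the form


$$\bar z^1 dz^1 + \bar z^2 dz^2$$


from $\mathbb{C}^2$ to $S^3$. The curvature of this connection

$$\Omega = \sin\theta\; d\theta\wedge d\phi = -2i\,\frac{d\zeta\wedge d\bar\zeta}{(1 + \zeta\bar\zeta)^2},$$

extended to Minkowski space, corresponds to a magnetic monopole of field
strength 1. By a short computation, we obtain the real and imaginary parts of
this form. The real part is $\tfrac{1}{2}\,d(|z^1|^2 + |z^2|^2)$, which vanishes on $S^3$; for the imaginary part we get


$$\Im(\bar z^1 dz^1 + \bar z^2 dz^2) = -x^2\, dx^1 + x^1\, dx^2 - x^4\, dx^3 + x^3\, dx^4.$$


Thus we set
$$\omega = i\,(-x^2\, dx^1 + x^1\, dx^2 - x^4\, dx^3 + x^3\, dx^4).$$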
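
To connect this Cartesian expression with the Euler angle form of the connection, one can substitute the parametrization 9.22 and 9.23 directly. The following worked check uses only those two equations.

```latex
\begin{aligned}
\Im(\bar z_1\,dz_1) &= \tfrac{1}{2}\cos^2\tfrac{\theta}{2}\,(d\psi + d\phi), \qquad
\Im(\bar z_2\,dz_2) = \tfrac{1}{2}\sin^2\tfrac{\theta}{2}\,(d\psi - d\phi), \\
\Im(\bar z_1\,dz_1 + \bar z_2\,dz_2) &= \tfrac{1}{2}\,\big(d\psi + \cos\theta\,d\phi\big).
\end{aligned}
```

Up to the normalization factor $\tfrac{1}{2}$, the restriction to $S^3$ is therefore the same one-form $d\psi + \cos\theta\,d\phi$ that appeared in the induced metric.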


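We verify our assertion by looking at the pullback of $\omega$ for the sections $s_1$ and
$s_2$ under the bundle trivialization constructed of hemispheres of $S^2$ overlapping
over an infinitesimal band around the equator. We leave to the reader to verify
that
$$s^*\omega = i\,\frac{\Im(\bar\zeta\, d\zeta)}{1 + \zeta\bar\zeta}.$$
Recall that
$$\zeta_1 = \cot(\tfrac{\theta}{2})\, e^{i\phi}.$$


Then, as in the computation leading to the Fubini-Study metric 5.63, we have
$$\zeta = \cot(\tfrac{\theta}{2})\, e^{i\phi},$$
$$d\zeta = -\tfrac{1}{2}\csc^2(\tfrac{\theta}{2})\, e^{i\phi}\, d\theta + i\cot(\tfrac{\theta}{2})\, e^{i\phi}\, d\phi,$$
$$\Im(\bar\zeta\, d\zeta) = \cot^2(\tfrac{\theta}{2})\, d\phi = \frac{1 + \cos\theta}{1 - \cos\theta}\, d\phi.$$


On the other hand


$$1 + \zeta\bar\zeta = 1 + \cot^2(\tfrac{\theta}{2}) = \csc^2(\tfrac{\theta}{2}) = \frac{2}{1 - \cos\theta}.$$


Combining these results together, we get


$$A_1 = ig(1 + \cos\theta)\, d\phi.$$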



For the chart based on the stereographic projection from the south pole, we
use

$$\zeta_2 = \tan(\tfrac{\theta}{2})\, e^{-i\phi}.$$
A completely analogous computation gives the local gauge potential


$$A_2 = -ig(1 - \cos\theta)\, d\phi.$$


9.4.3 BPST Instanton 


The complex version of the Dirac monopole introduced in the last part of the
section above can be extended to the quaternionic projective space $\mathbb{HP}^1$.
Let $(q^1, q^2) \in \mathbb{H}^2$, and define the equivalence relation


$$(q^1, q^2) \sim (\lambda q^1, \lambda q^2), \qquad \lambda \in \mathbb{H},\ \lambda \neq 0.$$


The quaternionic projective space $\mathbb{HP}^1$ is defined by the quotient $\mathbb{H}^2/\sim$ with
this equivalence relation. Effectively, the projective space is the space of
quaternionic lines through the origin. The restriction


$$|q^1|^2 + |q^2|^2 = 1$$


defines a unit sphere $S^7$ centered at the origin. Extending the crude visualization
shown in 8.2 for the complex projective space, the intersections of
quaternionic lines with $S^7$ yield three-spheres $S^3$. The restriction implies that $\lambda \in \mathbb{H}$
is a unit quaternion, and the set of unit quaternions is the unitary group


$$Sp(1) = U(1, \mathbb{H}) \cong SU(2).$$
This leads to the quaternionic Hopf bundle
$$S^3 \to S^7 \to S^4, \qquad S^4 \cong \mathbb{HP}^1.$$
There is a natural $sp(1)$-valued connection on the bundle defined by
$$\omega = \Im(\bar q^1 dq^1 + \bar q^2 dq^2),$$


where again, we neglect the real part, since that vanishes on $S^7$. Using the
stereographic projection from the north and south poles respectively, cover $S^4$
by two quaternionic charts $\{U_1, \zeta_1\}$ and $\{U_2, \zeta_2\}$, which overlap on a narrow
band around the $S^3$ equator. Of course, in physics the preferred parametrization
of this $S^3$ is by using Euler angles, and the action of $SU(2)$ on the bundle charts
is by right multiplication with Euler angle matrices $Q$. On the overlap in the
base space, the transition functions are


So 


The BPST instanton bundle corresponds to $n = 1$. If $s : U \to \pi^{-1}(U)$ is a
section of the bundle over one of these charts, we get a connection


$$A = s^*\omega = \Im\left(\frac{\bar q\, dq}{1 + |q|^2}\right). \qquad (9.24)$$




This is the gauge potential for the famous BPST instanton. The pullback
$$F = s^*\Omega = dA + A\wedge A$$


of the curvature $\Omega$ in the bundle represents the field strength. Since the chart
is locally Euclidean through the stereographic projection, we may think of $A$ as a
connection on $\mathbb{R}^4$ with curvature $F$. On $\mathbb{R}^4$, the curvature is anti-self-dual in the sense of the
Hodge $\star$ map,

$$\star F = -F.$$
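
The field strength of the potential 9.24 can be computed in closed form. The sketch below is a standard quaternionic calculation, not taken from the text; it uses the abbreviations $f = (1+|q|^2)^{-1}$, $\alpha = \bar q\,dq$ and $\rho = \tfrac{1}{2}\,d|q|^2 = \Re\,\alpha$, so that $A = f(\alpha - \rho)$.

```latex
\begin{aligned}
dA &= df\wedge(\alpha - \rho) + f\,d\bar q\wedge dq = -2f^2\,\rho\wedge\alpha + f\,d\bar q\wedge dq, \\
A\wedge A &= f^2\,\alpha\wedge\alpha = f^2\,\big(2\,\rho\wedge\alpha - |q|^2\,d\bar q\wedge dq\big), \\
F &= dA + A\wedge A = \big(f - f^2|q|^2\big)\,d\bar q\wedge dq = \frac{d\bar q\wedge dq}{(1 + |q|^2)^2}.
\end{aligned}
```

The second line uses $dq\,\bar q = d|q|^2 - q\,d\bar q$. The resulting 2-form $d\bar q\wedge dq$ is the usual quaternionic basis of the anti-self-dual 2-forms on $\mathbb{R}^4$ (for the standard orientation), which is consistent with $\star F = -F$.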


References 


[1] Abrams, H. 1971: The World of M. C. Escher. Meulenhoff International, Netherlands.

[2] Atiyah, M., Hitchin, N., and Singer, I. 1978: Self-duality in Four-dimensional Riemannian Geometry. Proc. R. Soc. Lond. A 362, pp 425-461.

[3] Belavin, A., Polyakov, A., Schwartz, A., and Tyupkin, Y. 1975: Pseudoparticle Solutions of the Yang-Mills Equations. Physics Letters B, Vol. 59, No. 1, pp 85-87.

[4] Chern, S. S. 1975: Introduction to Differential Geometry. Notes, University of California, Berkeley, CA.

[5] Chern, S. S. 1944: A Simple Intrinsic Proof of the Gauss-Bonnet Formula for Closed Riemannian Manifolds. Ann. Math. 45 (4), pp 747-752.

[6] Chern, S. S. 1955: An Elementary Proof of the Existence of Isothermal Parameters on a Surface. Proceedings of AMS, Vol. 6, No. 4, pp 771-782.

[7] Dray, T. 1985: The Relationship between Monopole Harmonics and Spin-weighted Spherical Harmonics. Journal of Mathematical Physics 26, 1030. https://doi.org/10.1063/1.526533

[8] Eguchi, T., Gilkey, P., and Hanson, A. 1980: Gravitation, Gauge Theories and Differential Geometry. Physics Reports 66, No. 6, pp 213-293.

[9] Georgi, H. 1999: Lie Algebras in Particle Physics. Frontiers in Physics V54, Westview.

[10] Goldberg, J., Macfarlane, A., Newman, E., Rohrlich, F., and Sudarshan, E. C. G. 1967: Spin-s Spherical Harmonics and ð. Journal of Mathematical Physics 8, 2155.

[11] Goldstein, Herbert, 1950: Classical Mechanics. Addison-Wesley, 398pp.

[12] Gray, Alfred, 2006: Modern Differential Geometry of Curves and Surfaces with Mathematica. 3rd ed. Chapman & Hall, CRC, 977pp.

[13] Hall, B. 2015: Lie Groups, Lie Algebras, and Representations. 2nd ed. Springer, Graduate Texts in Mathematics 222, Switzerland.

[14] Hicks, Noel, 1965: Notes on Differential Geometry. Van Nostrand Reinhold, Princeton, NJ.

[15] Hirsch, A. 2002: Extension of 'Villarceau-Section' to Surfaces of Revolution with a Generating Conic. J. of Geometry and Graphics, Vol. 6, No. 2, pp 121-132.

[16] Hoffman, K. and Kunze, R., 1971: Linear Algebra. 2nd ed. Prentice-Hall, 407pp.

[17] Jackson, J. D. 1962: Classical Electrodynamics.

[18] Kobayashi, S. and Nomizu, K. 1963: Foundations of Differential Geometry. Wiley & Sons, New York, London.

[19] Eisenhart, L. P. 1960: A Treatise on the Differential Geometry of Curves and Surfaces. Ginn and Company, Boston, 1909. Reprint by Dover Publications, New York, 1960.

[20] Abraham, R. and Marsden, J. 1978: Foundations of Mechanics. 2nd ed. Addison-Wesley, 838pp.

[21] Misner, C. W., Thorne, K. S., and Wheeler, J. A. 1973: Gravitation. W.H. Freeman and Company, 1279pp.

[22] Morris, M. S. and Thorne, K. S., 1988: Wormholes in Spacetime and their use for Interstellar Travel: A Tool for Teaching General Relativity. Am. J. Phys., 56(5), pp 395-412.

[23] Müller, Thomas, 2008: Exact geometric optics in a Morris-Thorne wormhole spacetime. Phys. Rev. D, 77, 044043-1-044043-11.

[24] Newman, E. and Penrose, R. 1962: An Approach to Gravitational Radiation by a Method of Spin Coefficients. Journal of Mathematical Physics 3, 566, pp 566-578. https://doi.org/10.1063/1.1724257

[25] Newman, E. and Janis, A. 1965: Note on the Kerr Spinning-Particle Metric. Journal of Mathematical Physics 6, 915, pp 915-917. https://doi.org/10.1063/1.1704350

[26] O'Neill, Barrett, 2006: Elementary Differential Geometry. 2nd ed. Academic Press, 503pp.

[27] Oprea, J., 1997: Differential Geometry and Its Applications. Prentice-Hall, Englewood Cliffs, NJ.

[28] Oprea, J., 2000: The Mathematics of Soap Films. Student Mathematical Library, Volume 10, AMS, Providence, RI.

[29] Penrose, R. 1987: On the Origins of Twistor Theory. Gravitation and Geometry, a volume in honour of I. Robinson, Bibliopolis, Naples, 25pp.

[30] Penrose, R. 1976: Nonlinear Gravitons and Curved Twistor Theory. General Relativity and Gravitation 7, No. 1, pp 31-52.

[31] Rogers, C. and Schief, W. K., 2002: Bäcklund and Darboux Transformations: Geometry and Modern Applications in Soliton Theory. Cambridge University Press.

[32] Schwarz, A. 1977: On Regular Solutions of Euclidean Yang-Mills Equations. Physics Letters B, Vol. 67, No. 2, pp 172-174.

[33] Spivak, M. 1965: Calculus on Manifolds. Addison Wesley, 159pp.

[34] Spivak, M. 1979: A Comprehensive Introduction to Differential Geometry, 5 volume set. 2nd ed. Publish or Perish, Texas.

[35] Struik, Dirk J., 1961: Lectures on Classical Differential Geometry. 2nd ed. Dover Publications, Inc., 232pp.

[36] Taub, A. H. 1939: Tensor Equations Equivalent to the Dirac Equations. Annals of Mathematics, Vol. 40, No. 4, pp 937-947.

[37] Trautman, A. 1977: Solutions of the Maxwell and Yang-Mills Equations Associated with Hopf Fibrings. International Journal of Theoretical Physics, Vol. 16, No. 8, pp 561-565.

[38] Terng, C. and Uhlenbeck, K., 2000: Geometry of Solitons. Notices of the AMS, Vol. 47, No. 1.

[39] Ward, R. S., 1977: On Self-dual Gauge Fields. Physics Letters A, Vol. 61, No. 2, pp 81-82.

[40] Weisstein, E. W.: From MathWorld, A Wolfram Web Resource. [Available online at http://mathworld.wolfram.com/CostaMinimalSurface.html.]


Index 


Acceleration 
along a curve, 11 
Centripetal, 19 
Adjoint 
Map, 258 
Representation, 258 
Angular momentum 
Commutation relations, 289 
Definition, 288 
Operator, 292 
tensor, 296 
Associated vector bundle, 331 


Baseball curve, 176 
Beltrami equation, 168 
Bianchi identities, 195, 196 
Bianchi permutability, 152, 155 
Bianchi transform, 151, 153 
Bjorling surface, 179 
Bour surface, 181 
BPST Instanton, 348 
Bundle 
Cotangent, 35 
Dual basis, 35 
Fiber, see Fiber bundle 
Section, 2 
Tangent, 2, 99 
Tensor, 38 
Bundle of frames, 326 
Bundle of orthogonal matrices, 328 
Bundle of unitary matrices, 328 
Bäcklund Transform, 147-158
Classical formula, 150 


Campbell-Baker-Hausdorff, 256 
Cartan equations 
Connection form, 88, 191 


Curvature form, 89 
First structure equations, 87 
for surface in R3 , 128 
in N-P formalism, 304 
in NP formalism, 306 
in principal bundle, 339 
Manifolds, 193 
Second structure equation, 88 
Cartan magic formula, 245 
Cartan subalgebra, 264, 317 
Casimir operator, 290 
Catenoid 
First fundamental form, 104 
Helicoid curvature, 133 
Minimal surface, 177 
Cauchy-Riemann equations, 164, 166 
Cayley-Klein parameters, 273, 280 
Christoffel symbols, see Connection 
Circle 
Curvature, 18 
Frenet frame, 18 
Clairaut relation, 217
Clifford algebra, 300, 303 
Complex structure 
of surface in $\mathbb{R}^3$, 165
Cone 
First fundamental form, 105 
Geodesics, 217 
Conformal map, 163-165 
Definition, 165 
Jacobian, 164 
Mercator, 101 
Stereographic, 169 
Conformal tensor, 304 
Conical helix, 32, 105 
Conjugate harmonic, 164, 166 




Connection
Affine, 201
Change of basis, 91, 204, 205
Christoffel symbols, 83, 122, 125, 190, 205
Compatible with metric, 82, 84
Curvature form, 89
Ehresmann, 333
Frenet Equations, 86
Koszul, 81
Levi-Civita, 121, 190
Linear, 200
Parallel transport, 121
Spin, 305
Contractible, 247
Contraction, 40
Coordinate
Cylindrical, 41
Functions, 1
Geodesic, 132
Isothermal, 134, 166
Local, 7
Minkowski, 42
Polar, 8, 30, 58, 60
Slot functions, 10
Spherical, 41, 60, 291
Transformation, 7
Weierstrass, 175
Coordinate patch, see Patch
Cornu spiral, 29
Costa’s surface, 183
Covariant derivative
Divergence, 124
Tensor fields, 84
Vector fields, 81, 191
Covariant differential, 201
of surface normal, 131
of tensor-valued 0-form, 202
of vector field, 202
Covering space, 322
Curvature
Form, see Cartan equations
Gaussian, see Gaussian curvature
Geodesic, 107
Normal, 107-109, 112
of a curve, 15
Curves, 10-33
Fundamental theorem, 22-28
in $\mathbb{R}^3$, 10-22
Isotropic, 174
Natural equations, 28
Plane, 20
Curvilinear Coordinates, 78


de Rham
Cohomology, 247
Complex, 246
de Sitter space, 199
Deformation retract, 247
Determinants
By pull-back, 65
Definition, 46
Levi-Civita symbol, 47
of matrix exponential, 160
Developable surface
Definition, 139
K=0, 139
Diffeomorphism, 7
Differentiable map, 7-9
Jacobian, 7
Push-forward, 8
Differential forms
Alternation map, 52
Closed-Exact, 63, 247
Covariant tensor, 36
Dual, 64
Maxwell 2-form, 71
n-forms, 51
One-forms, 34
Pull-back, 56
Tensor-valued, 53, 199
Two-forms, 43
Dini’s surface, 154
Dirac equation, 300
Dirac monopole, 342
Directional derivative, 4, 240
Distribution, 333
Definition, 251
Integrable, 251
Dolbeault operator, 171
Dual forms
Hodge operator, 64
In $\mathbb{R}^2$, 67
In $\mathbb{R}^3$, 67
In $\mathbb{R}^n$, 65
In Minkowski space, 69 
Dual tensor, 73, 295 


Einstein equations, 209 
Einstein manifold, 209 
Enneper surface, 175 
Eth operator 
Definition, 313 
in spherical coordinates, 314 
Raising-lowering, 314 
Spin-weight, 313 
Euler angles, 271 
Euler characteristic, 231 
Euler’s theorem, 119 
Euler-Lagrange equations 
Arc length, 214 
Electromagnetism, 74 
Minimal area, 159 
Exponential map 
Definition, 255 
Integral curve, 256 
Exterior covariant derivative 
Cartan equations, 204 
Definition, 203 
on principal bundle, 338 
Exterior derivative 
Codifferential, 66 
de Rham complex, 70, 246 
of n-form, 54, 239 
of one-form, 54, 194, 339 
of two-form, 239 
Properties, 54, 240 


Fiber bundle 
Definition, 320 
Line, 322 
Map, 324 
Pull-back, 324 
Section, 322 

Flow, 238 

Foliation, 251 

Frame 
Cylindrical, 77 
Darboux, 106 
Dual, 76 
in $\mathbb{R}^n$, 75


Orthonormal, 76 
Spherical, 77, 291 
Frenet frame, 15-22 
Binormal, 15 
Curvature, 15 
Frenet equations, 16 
Osculating circle, 19 
Torsion, 16 
Unit normal, 15 
Fresnel diffraction, 30 
Fresnel integrals, 21 
Frobenius theorem, 252 
Fubini-Study metric, 170, 313 
Fundamental vector, 266, 333 


Gauge fields, 339 
Electrodynamics, 340 
Yang Mills, 340 

Gauss equation, 188 

Gauss map, 117 
Minimal surfaces, 181 

Gauss-Bonnet formula, 229 

Gauss-Bonnet theorem, 227—232 
For compact surfaces, 231 

Gaussian curvature 
by curvature form, 128 
by Riemann tensor, 127 
Classical definition, 114 
Codazzi equations, 128 
Gauss equations, 120 
Geodesic coordinates, 132 
Invariant definition, 117 


Orthogonal parametric curves, 130 


Principal curvatures, 116 
Principal directions, 116 
Theorema egregium, 126-133 
Torus, 129 
Weingarten formula, 126 
Gell-Mann matrices, 316 
General linear group, 234 
Genus, 231 
Geodesic 
and isometries, 244 
Coordinates, 132 
Curvature, 107, 228 
Definition, 206 


in orthogonal coordinates, 215 




Torsion, 107 
Good cut equation, 313 
Gradient, 41 

Curl Divergence, 50 
Grassmannian, 329 


Half-flat space-time, 313 
Helicoid 
Catenoid curvature, 133 
First fundamental form, 104 
Tangential developable, 139 
Helix, 10 
Frenet frame, 20 
Unit speed, 15 
Henneberg 
Surface, 177 
Hilbert theorem, 142 
Hodge decomposition, 67 
Holomorphic function, 164 
Holonomy, 189, 192, 228 
Homotopy, 247, 325 
Hopf bundle, 323, 328, 331, 343, 345 
Quaternionic, 348 
Hopf fibration 
on $\mathbb{HP}^1$, 284
on $\mathbb{CP}^1$, 280
on $\mathbb{RP}^1$, 278
Horizontal subspace, 333 


Inner product, 38-42 
bra-ket vectors, 40 
k-forms, 66 
Polarization identity, 24 
Standard in $\mathbb{R}^n$, 24

Integral Curve, 7 

Integral curve, 237, 256 

Interior product, 40 
Properties, 245 

Inverse function theorem, 9 

Isometries, 24, 165 
Also , see Killing vector 

Isothermal coordinates, 166-168, 173 
Existence, 166 

Isotropy subgroup, 265 


Jacobi equation, 132 
Jacobi identity, 251 




Jacobian, see Push-forward 


Kähler manifold 
Kähler form, 171 
Kähler potential, 171 
Kerr metric 
Boyer-Lindquist coordinates, 308 
Connection forms, 313 
Ergosphere, 309 
Kerr-Schild form, 308 
Killing form, 259 
Killing vector 
Definition, 244 
Lie subalgebra, 251 
Kuen surface, 155 


Ladder operators, 290 
Lagrangian 
Arc length, 214 
Dirac, 341 
Electromagnetic, 73, 341 
Quantum Chromodynamics, 342 
Quantum Electrodynamics, 342 
Laplacian 
Beltrami, 124, 125, 167 
by dual forms in R, 69 
Harmonic function, 125 
Isothermal, 166 
on forms, 66 
Orthogonal coordinates, 80 
Spherical coordinates, 79, 124, 292 
Legendre polynomials, 292 
Levi-Civita symbol, 47, 295, 298 
Lie algebra 
so(1,3), 295 
so(2, R), 270 
so(3, R), 271 
su(2), 273 
su(3), 316 
Definition, 251 
Homomorphism, 255, 256, 267 
of a Lie group, 253 
Simple, semisimple, 263 
Subalgebra, 251 
Lie bracket 
as a Lie derivative, 241 
Definition, 114 




Lie derivative 
of 1-form, 240 
of function, 240 
of tensor field, 241, 243 
of vector field, 240, 241 
Properties, 243 
Lie group 
SO(2,R), 88, 164 
SU(2), 272 
SU(3), 315 
Definition, 233 
Left invariant, 253 
Left-right translation, 253 
Liebmann theorem, 143 
Linear Derivation, 4 
Lines, 10 
Logarithmic spiral, 30 
Lorentz group, 294-303 
Lorentz transformation, 294 
by spinors, 306 
Infinitesimal, 295, 303 
Lorentzian manifold 
Definition, 208 
Einstein tensor, 209 
Ricci tensor, 208 
Scalar curvature, 209 
Loxodrome, 32, 101 


Möbius band
as fiber bundle, 321 
Manifold 
Definition, 94 
Differentiable structure, 95 
Product, 186 
Riemannian, see Riemannian 
Submanifold, 95, 188-199, 251 
Maurer-Cartan equation, 262 
Maurer-Cartan form 
Definition, 261 
in SO(2,R), 88 
Maxwell equations, 71—74 
Maxwell spinor, 302 
Mean curvature 
Classical definition, 114 
in isothermal coordinates, 166 
Invariant definition, 117 
Metric 




Cylindrical coordinates, 41 

Metric tensor, 38 

Minkowski, 42 

Riemannian, see Riemannian 

Spherical coordinates, 41 
Minimal surfaces, 173-184 

and stereographic projection, 181 

Conjugate, 180 

Definition, 129 

Gaussian curvature, 180 

Isometric family, 181 

Minimal area property, 158 

Table of Weierstrass parameters, 

182 

Minkowski space, 42 

Dual forms, 69 
Morris-Thorne 

Coframe, 218 

Connection forms, 219 

Curvature forms, 219 

Geodesics, 220 

Ricci tensor, 219 

Wormhole metric, 218 
Möbius band

as a ruled surface, 137 


N-P Formalism, 304 
Natural equations, 28-33 
Cornu spiral, 29 
Logarithmic spiral, 30 
Meandering curve, 32 
Newman-Penrose equations, 306 
Null tetrad, 304 
Null vector, 301 


One-parameter 
Flow, 238 
Group of diffeomorphisms, 236, 240, 

265, 266 
Subgroup of Lie group, 255, 256, 
273, 338 

Orthogonal 
Basis, 24 
Parametric curves, 99 
Transformation, 25, 76 

Orthogonal group, 234, 269-294 


Parallel 




Section, 208 
Transport, 207 
Vector field, 206 
Patch 
Associated harmonic family, 166, 
181 
Conjugate, 166 
Definition, 93 
Degenerate, 153 
Holomorphic, 165, 173, 181 
Isothermal, 173 
Monge, 126, 158 
Tchebychev, 147 
Weierstrass, 175, 180 
Pauli matrices, 273 
Pauli-Bloch vector, 277 
Permutation symbol, see Levi-Civita 
Petrov classification, 307 
Pfaffian system, 252 
Phase transformation, 341 
Plücker coordinates, 330
Poincaré group, 235 
Poincaré lemma, 64, 247 
Principal fiber bundle 
Definition, 325 
Ehresmann connection, 333 
Horizontal lift, 337 
Projective space, 323, 328, 330 
Pseudosphere 
Area, volume, 145 
First fundamental form, 143 
Gaussian curvature, 143 
Parametric equation, 103 
Second fundamental form, 145 
Pull-back 
Chain rule, 57 
Definition, 56 
Determinant, 65 
Line integrals, 57 
of volume form, 57 
Polar coordinates, 58 
Properties, 56 
Surface integrals, 59 
Tensor field, 239 
Push-forward 
Jacobian, 8, 239 
Tensor fields, 239 




Vector field, 99, 238 
Pythagorean triplets, 172 


Quaternion 
Conjugate, 275 
Definition, 274 
in Hopf map, 284 


Raising-lowering operator, 290 
Raising-lowering operator, 264
Representation 
Definition, 258 
Irreducible , 258 
Ricci flat, 209 
Ricci identities, 192 
with torsion, 203 
Ricci rotation coefficients, 304 
Riemann sphere, 301 
Bloch sphere, 281 
Complex structure, 170 
metric, 170 
Riemann tensor 
Components, 127, 191, 195 
Symmetries, 195 
Riemannian 
Connection, 188 
Hypersurface, 188 
Manifold, 185 
Metric, 185 
Product manifold, 186 
Riemann tensor, 189 
Second fundamental form, 188 
Structure equations, 193 
Submanifold, 188 
Theorema egregium, 193 
Torsion tensor, 189 
Robinson congruence, 283 
Rodrigues formula, 292 
Root diagram, 264, 318 


Roots 
su(3), 317 
Rotations 
by quaternions, 277 
in R? , 269 
in R3 , 271 


Ruled surface 
Cordinate patch, 136 


360 


Distribution parameter, 139 
Gaussian curvature, 138 
Hyperboloid, 137 

Saddle, 136 

Stricture curve, 139 


Scherk’s surface, 159 
Schrödinger equation, 287
Schwarz inequality, 25 
Schwarzschild 

Bending of light, 226 


Eddington-Finkelstein coordinates, 


210 

Geodesics, 221 

Metric, 220, 221 

Precession of Mercury, 226 
Second fundamental form 

Asymptotic directions, 109 

by covariant derivative, 115 

Classical formulation, 109 

for K = —1, 145 

Surface normal, 106 
Sectional Curvature, 196 
Simplex, 63 
Sine-Gordon 

Soliton, 147 

Surface with K = —1, 146 
Singular cube, 63 
Soliton 

Moving, 154 

One-soliton solution, 153 

Two-soliton solution, 155 
Space of constant curvature, 197 
Sphere 

Coordinate chart, 95 

Euler characteristic, 231 

First fundamental form, 101 

Gauss map, 117 

Gaussian curvature, 119, 129 

Geodesics, 214 

Loxodrome, 101 

Orthonormal frame, 77 

Second fundamental form, 110 

Structure equations, 89 

Temple of Viviani, 10 

Total curvature, 230 
Spherical harmonics, 292 




Spin-weighted, 314 
Spin coefficients, 305 
Spin dyad, 305 
Spinor 
2-spinor, 297 
4-spinor, 300 
Identities, 299 
Spin space, 297 
Symmetric, 301, 302 
Transformation law, 297 
Stereographic projection, 169-173 
in $\mathbb{R}^n$, 171
in Hopf fibration, 282
Inverse map in $S^2$, 170
Minimal surfaces, 181
Stokes’ theorem
Green’s theorem, 60
in $\mathbb{R}^n$, 62
In $\mathbb{R}^3$, 70
Structure constants 
of su(2) , 274 
of su(3), 316 
of Lie algebra, 262 
Surface 
Compact, 140, 231 
Definition, 95 
First fundamental form, 98, 99 
Normal, 106 
Orientable, 137 
Second fundamental form, 109 
Surface area, 111 
Surface of revolution 
First fundamental form, 102 
Geodesics, 216 
Parametric equation, 96 
Symplectic 
Matrix, 164 
Symplectic group, 235 


Tangent bundle, 2 

Tangent vector, 1-7 
Contravariant components, 5 
in R”, 2 

Taub metric, 282 

Tennis ball curve, 176 

Tensorial form 
of adjoint type, 92, 338 


Tensors
Antisymmetric, 41, 46
Bilinear map, 36
Bundle, 38
Components, 37, 239
Contravariant, 37
Metric, 38
Riemann, see Riemannian, Riemann tensor
Self-dual, 66, 295, 302
Tensor product, 37
Torsion, see Riemannian
Transformation law, 239, 294
Theorema egregium, see Gaussian curvature
Third fundamental form, 115
Torsion
of a connection, 115, 121
Torus
Euler characteristic, 231
First fundamental form, 104
Parametric equation, 96
Total curvature, 230
Total curvature, 230
Transformation group
Definition, 265
Transition functions
In $\mathbb{R}^2$, 94
Local coordinates, 205
on manifold, 205
Triangulation, 230
Trinoid, 182
Twistor, 235, 273, 283, 330


Unitary group, 235


Vaidya
Curvature form, 212
Metric, 209
Ricci tensor, 213
Vector field, 2
Flow, 238
Left invariant, 253
Vector identities, 49-51
Velocity, 12-15
Vertical subspace, 333
Villarceau circles, 282
Viviani curve, 10


Wedge product
2-forms, 43
Cross product, 44
Weierstrass elliptic function, 183
Weierstrass patch, see Patch
Weierstrass substitution, 172
Weingarten map
Definition, 114
Eigenvalues, 117
Shape operator, 114
Wigner D-matrices, 294, 315


Zero section, 322


Differential Geometry in Physics is a treatment of the mathematical foundations 
of the theory of general relativity and gauge theory of quantum fields. 
The material is intended to help bridge the gap that often exists between 
theoretical physics and applied mathematics. 


The approach is to carve an optimal path to learning this challenging field 
by appealing to the much more accessible theory of curves and surfaces. 
The transition from classical differential geometry as developed by Gauss, 
Riemann and other giants, to the modern approach, is facilitated by a very 
intuitive approach that sacrifices some mathematical rigor for the sake 
of understanding the physics. The book features numerous examples of 
beautiful curves and surfaces often reflected in nature, plus more advanced 
computations of trajectory of particles in black holes. Also embedded in the 
later chapters is a detailed description of the famous Dirac monopole and 
instantons. 


Features of this book:
• Chapters 1-4 and chapter 5 comprise the content of a one-semester
course taught by the author for many years.
• The material in the other chapters has served as the foundation for
many master’s theses at University of North Carolina Wilmington for
students seeking doctoral degrees.
• An open access ebook is available at Open UNC (openunc.org).
• The book contains over 80 illustrations, including a large array of
surfaces related to the theory of soliton waves that does not commonly
appear in standard mathematical texts on differential geometry.


GABRIEL LUGO is an associate professor in the Department of Mathematics 
and Statistics at the University of North Carolina Wilmington. He earned 
his undergraduate and Ph. D. degrees from the University of California, 
Berkeley. Lugo has a strong interest in instructional technology in collegiate 
education, and has done extensive work in that area with colleagues in the 
math, chemistry, and computer science departments. 


UNIVERSITY of NORTH CAROLINA WILMINGTON
WILLIAM MADISON RANDALL LIBRARY


Distributed by UNC Press
www.uncpress.org


